1. Introduction / Outline; Note for 2nd Edition; 
Unresolved Problems 


Who/How/What, “Tech. Index”, Messages, Personal Note 


1. For Whom is This Book Written? 


This book is primarily for PhD scientists and engineers who want to learn about 
quantitative finance, and for finance graduate students¹. Practicing “quants” and 
academic research workers will find topics of interest. There are even essays with 
no equations for non-technical managers. 


2. How Can This Book Benefit You? 


This book will enable you to gain an understanding of practical and theoretical 
quantitative finance and risk management. 


3. What is In This Book? 


The book is a combination of a practical “how it’s done” book, a textbook, and a 
research book. It contains techniques and results for quantitative problems with 
which I have dealt in the trenches for many years as a quant on Wall Street. Each 
topic is treated as a unit, sometimes drilling way down. Related topics are 
presented in parallel, because that is how the real world works. An informal style 
is used to convey a picture of reality. There are even some stories. 


4. What is the “Tech. Index”? What Technical Background is Needed? 


The “Tech. Index” for each chapter is a relative index for this book lying between 
1 and 10 and indicating technical or mathematical sophistication. The average index 
is 5. An index of 1-3 requires almost no math, while 8-10 assumes a PhD. No 
background in finance is assumed, but some would definitely be helpful. 


¹ History: The book is an outgrowth of my tutorial on Risk Management given annually 
for five successive years (1996-2000) at the Conference on Intelligence in Financial 
Engineering (CIFEr), organized jointly by the IEEE and IAFE (now called IAQF). The 
attendees comprised roughly 50% quantitative analysts holding jobs in finance and 50% 
PhD scientists or engineers interested in quantitative finance. The chapter on Climate 
Change Risk Management resulted from a substantial involvement with climate change. 


5. How Should You Read This Book? What is in the Footnotes? 


You can choose topics that interest you. Chapters are self-contained. The 
footnotes add depth and interesting commentary. 


6. Message to Non-Technical Managers 


Parts of this book will help you get a better understanding of quantitative issues. 
Important chapters have discussions of systems, models, and data. Skip sections 
with equations (maybe read chapters with the Tech. Index up to 3). 


7. Message to Students 


You will learn quantitative techniques better if you work through derivations on 
your own, including performing calculations, programming and reflection. The 
mathematician George Polya gave some good advice: "The best way to learn 
anything is to discover it by yourself". Bon voyage. 


8. Message to PhD Scientists and Engineers 


While the presentation is aimed at being self-contained, financial products are 
extensive. Reading a finance textbook in parallel would be a good idea. 


9. Message to Professors 


Part of the book can be used in an advanced MS or PhD finance course (Tech. 
Index up to 8)², or for MBAs (Tech. Index up to 5). Topics you may find of 
interest include: (1) Feynman path integrals and Green functions for options, (2) 
The Macro-Micro trend-risk model with explicit time scales connecting to both 
macroeconomics and finance, (3) Optimally stressed correlation matrices, (4) 
Stressed VAR, (5) Smart Monte Carlo, (6) Market crisis modeling, (7) 
Psychology and options models, (8) Climate-Change Risk. 


10. A Personal Note 


This book is largely based on my own work and/or first-hand experience. It is in 
part retrospective, looking back over trails traversed and sometimes blazed. Some 
results are in my 1988-89 CNRS preprints when I was on leave from the CNRS 
as the head of the Quantitative Analysis Group at Merrill Lynch, in my 1993 
SIAM Conference talk, and in my CIFEr tutorials. Footnotes entitled “History” 
contain dates when my calculations were done over the years, along with 
recollections and stories³. 


² The book was used for a course I taught at the Courant Institute, NYU. 


Summary Outline: Book Contents 


The book consists of seven divisions. 


I. Qualitative Overview of Risk 


A qualitative overview of risk is presented, plus an instructive and amusing 
exercise emphasizing communication. 


II. Risk Lab for Derivatives (Nuts and Bolts of Risk Management) 


The “Risk Lab” first examines equity and FX options, including skew. Then 
interest rate curves, swaps, bonds, caps, and swaptions are discussed. Practical 
risk management including portfolio aggregation is discussed, along with static 
and time-dependent scenario analyses. 

This is standard textbook material, and directly relevant for basic quantitative 
work. 


III. Exotics, Deals, and Case Studies 


Topics include barriers, double barriers, hybrids, average options, the Viacom 
CVR, DECs, contingent caps, yield-curve options, reloads, index-amortizing 
swaps, and various other exotics and products. 

By now, this is mostly standard material. The techniques presented in the 
case studies are generally useful, and would be applicable in other situations. 


IV. Quantitative Risk Management 


Topics include optimally stressed positive-definite correlation matrices, fat-tail 
volatility, Plain/Stressed/Enhanced VAR, CVAR uncertainty, credit issuer and 
counterparty risk, model issues and quality assurance, systems issues and 
strategic computing, data issues, the Wishart Theorem, economic capital, and 
unused-limits risk. This is the largest of the six divisions of the book. 

Much of this material is standard, although there are various improvements 
and innovations. 


³ History: My quant finance/risk positions were VP Manager at Merrill Lynch (1987-89), 
Director at Eurobrokers (1989-90), Director at Fuji Capital Markets Corp. (1990-93), VP 
at Citibank (1993), Director at Smith Barney/Salomon Smith Barney/Citigroup 
(1993-2003), Director at Moore Capital Management (2007-08), and head of Strategic Risk 
Research / Quant Risk Analytics at Bloomberg LP (2010-15). 


V. Path Integrals, Green Functions, and Options 


Feynman path integrals provide an explicit and straightforward method for 
evaluating financial products, e.g. options. The simplicity of the path integral 
technique avoids mathematical obscurity. My original applications of path 
integrals and Green functions to options are presented, including pedagogical 
examples, mean-reverting Gaussian dynamics, memory effects, multiple 
variables, and two related straightforward proofs of Girsanov’s theorem. 
Consistency with the stochastic equations is emphasized. Numerical aspects are 
treated, including the Castresana-Hogan path-integral discretization. Critical 
exponents and the nonlinear-diffusion Reggeon Field Theory (RFT) are discussed. 
Recent empirical connections of the RFT with markets in crises are presented. 
The results by now are all known. The presentation is not standard. 


VI. Trend Risk and The Macro-Micro Model 


Trend risk is the fourth generic risk (separate from volatility risk, correlation risk, 
and jump risk). The problem is “sliding into the mud”, not covered by the cost of 
doing business, as in the transition into a recession. The Macro-Micro model has 
built-in trend risk. Initially, as developed with A. Beilis, it originated through an 
examination of models capable of reproducing yield-curve dynamical behavior — 
in a word, producing yield-curve movements that look like real data. The real 
feature of this model is that it has time scales with different dynamic behavior. 
The model contains separate mechanisms for long-term and short-term behaviors 
of rates. The model is connected in principle with macroeconomics through 
quasi-random quasi-equilibrium paths, and it is connected with financial models 
through strong mean-reverting dynamics for fluctuations through trading. 
Applications of the Macro-Micro model to the FX and equities markets are also 
presented, along with recent formal developments. Option pricing and no- 
arbitrage in the Macro-Micro framework are discussed. Finally a “function 
toolkit”, possibly useful for business cycles and/or trading, is presented. 
This material is not standard but should be. 


Note for the 2nd Edition (2015) 


A lot of water has passed under the bridge since the 1st edition, first printed in 
2004, including a financial crisis and increased emphasis on risk management. 
The material however continues to be relevant; I use it regularly (even items I 
thought were peripheral). The added material for the 2nd edition is largely new 
and hopefully will be of interest⁴,⁵. 


⁴ History: There are now many quantitative finance textbooks and journals. No textbooks 
and almost no papers existed when I started working on quantitative finance in the late 
1980s along with other early quants. We worked it all out ourselves. 


Climate Change Risk Management - NEW TOPIC 


Climate change risk management is the most significant addition to the 2nd 
edition of this book. Climate change with its global warming temperature trend 
presents increasingly serious risks to business, the economy, finance, and society in 
general. I believe that the ensemble of increasingly serious climate impacts can 
potentially destabilize the inherently unstable worldwide financial and economic 
systems into deep crisis. 

I propose a formal structure for climate risk management, and also emphasize 
positive opportunities occasioned by mitigating climate change. I also propose 
two climate risk metrics: “Climate Change Value at Risk” and a “Climate Change 
Reward-to-Risk Ratio”. I discuss a negative ethically based discount rate for 
valuation of future climate impacts, if we do not act sufficiently responsibly on 
climate mitigation. I also give a survey of climate issues (science, impacts, 
mitigation/adaptation, contrarians). 

I believe that we must be optimistic on working out a solution to the climate 
problem. There is no other choice. 


Evaluation — Where do we stand? What about the future? 


The current generation of quants and risk analysts has set up a basically sound 
quantitative finance and risk management structure since the late 1980s, starting 
from essentially no quantitative structure. This has required an immense effort 
involving many people across many disciplines. Still, my opinion is that risk 
management of finance is essentially still in an adolescent state, because our risk 
procedures are only effective at times when there are no really big risks. 


Unsolved Problems in Finance and Risk Management 


Here is a short list of what I believe are important basic unsolved problems. 
Advances will require increased understanding of these issues. 

1. The inherent basic instability of financial systems, including crises 
2. Psychology and finance 
3. Time scales and finance 
4. How to re-establish connections between economics and finance 
5. Climate change risk management; climate effects on financial systems 


I hope that the next generation will take quantitative finance and risk 
management further than we have gone. Some advice: Look at the big picture. 
Bon voyage. 


⁵ References: Time limitations precluded substantial reference updates to the 2nd edition. 
The exception is Ch. 53 on Climate Change Risk Management, which contains up-to-date 
references. 


2. Overview (Tech. Index 1/10) 


In this overview, we look at some general aspects of quantitative finance and risk 
management. There is also some advice that may be useful. A reminder: the 
footnotes in this book have interesting information. They function as sidebars, 
complementing the text¹. 


Objectives of Quantitative Finance and Risk Management 


The general goal of quantitative finance and risk management is to quantify the 
behavior of financial instruments today and under different possible 
environments in the future. This implies that we have some mathematical or 
empirical procedure of determining values of the instruments as well as their 
changes in value under various circumstances. While the road is long, and while 
there has been substantial progress, for many reasons this goal is only partially 
achievable in the end and must be tempered with good judgment. Especially 
problematic are the rare extreme events, which are difficult to characterize, but 
where most of the risk lies. 


Why is Quantitative Finance a Science? 


Outwardly, the quantitative nature of modern finance and risk management 
seems like a science. There are models that contain theoretical postulates and 
proceed along mathematical lines to produce equations valuing financial 
instruments. There are "experiments" which consist of looking at the market to 
determine values of financial instruments, and which provide input to the theory. 
Finally, there are computer systems, which keep track of all the instruments and 
tie everything together. 


Why is Quantitative Finance not a Science? 


In science there is real theory in the sense of Newton's 2nd law (F = ma) backed 
by a large collection of experiments with high practical predictive power and 
known domains of applicability (e.g. objects not too small and not moving too 
fast). 


¹ Why Read the Footnotes? Robert Karplus, the physicist who taught the graduate 
course in electromagnetism at Berkeley, said once that the most interesting part of a book 
is often in the footnotes. The footnotes are an integral part of this book. 

In contrast, financial theoretical "postulates", when examined closely, turn 
out to involve assumptions, which are at best only partially justifiable in the real 
world. The financial analogs to scientific "experiments" obtained by looking at 
the market are of limited value. Market information may be quite good, in which 
case not much theory is needed. If the market information is not very good, the 
finance theory is relatively unconstrained. Finance computer systems are always 
incomplete and behind schedule (this is a theorem). The technical knowledge of 
managers can be inversely related to their position. Somebody must be at the 
head of a big department, and this person may be from legal or sales. They know 
their universe inside and out, and they will not know the details of yours². 


Quantitative Finance is Not Science but Phenomenology 


The situation characterizing quantitative finance is really what physicists call 
"phenomenology". Even if we could know the "Newton laws of finance", the real 
world of finance is so complex that the consequences of these laws could not be 
evaluated with any precision. Instead, there are financial models and statistical 
arguments that are only partially constrained by the real world, and with 
unknown domains of applicability, except that they often break when the market 
conditions change in an extreme fashion. The main reason for this fragility is that 
human psychology and macroeconomics are fundamentally involved, yet not 
explicitly modeled. The worst cases for risk management, such as the onset of 
collective panic or the potential consequences of a deep recession, are impossible 
to quantify uniquely—extra assumptions tempered by judgment must be made. 
Risk management is very necessary, but unfortunately risk management mostly 
works best when there is not much risk³. 


What About Uncertainties in the Risk Itself? 


A characteristic showing why risk management is not science deals with the lack 
of quantification of the uncertainties in risk calculations and estimates. 
Uncertainty or error analysis is always done in scientific experiments. It is 
preferable to call this activity "uncertainty" analysis because "error" tends to 
conjure up human error. While human error should not be underestimated, the 
main problem in finance usually lies with uncertainties and incompleteness in the 
models and/or the data. Risk measurement is standard, but the quantification of 
uncertainty in the risk itself is usually ignored. 


² Having said that, upper managers may have qualitative insight into practical 
applications of your work, maybe better than you, and they can ask fruitful questions. 


³ This is the origin of the need for regulation, capital buffers, audit groups, etc. Financial 
markets regularly crash after bubbles. Standard risk management only measures 
consequences of fluctuations small compared to crashes. Standard risk management also 
does not measure trend risk, e.g. sliding into recession, which we cover later in the book. 

We will discuss one example in determining the uncertainty in risk when we 
discuss the uncertainties in the components of risk (Component VARs) that lead 
to a given total risk (VAR) at a given statistical level. We hope that such 
measures of uncertainty will become more common in risk management. 

In finance, there is too often an unscientific accounting-type mentality. Some 
people do not understand why uncertainties should exist at all, tend to become ill 
tempered when confronted with them, and only reluctantly accept their existence. 
The situation is made worse by the unscientific and meaningless precision often 
used by risk managers and quants to quote risk results. Risk quantities that may 
have large inherent unspecified uncertainties are quoted to many decimal places. 
False confidence, misuse and misunderstanding can and do occur. A fruitless 
activity consists of attempting to explain why one result is different from another 
result under somewhat different circumstances, when the numerical uncertainties 
in all these results are unknown and potentially greater than the differences being 
examined. 
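
As a minimal sketch of what quantifying the uncertainty in a risk number could look like, the Python fragment below bootstraps a 99% loss quantile from a sample of daily P&L values. This is not the Component VAR analysis treated later in the book. The bootstrap technique, the function names, and the simulated Gaussian P&L sample are all assumptions chosen only for illustration.

```python
import random
import statistics


def loss_quantile(pnl, confidence=0.99):
    """Empirical loss at the given confidence level (losses are -P&L)."""
    losses = sorted(-x for x in pnl)
    index = min(len(losses) - 1, int(confidence * len(losses)))
    return losses[index]


def bootstrap_uncertainty(pnl, confidence=0.99, n_resamples=500, seed=0):
    """Mean and standard deviation of the loss quantile over bootstrap resamples."""
    rng = random.Random(seed)
    estimates = [
        loss_quantile(rng.choices(pnl, k=len(pnl)), confidence)
        for _ in range(n_resamples)
    ]
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical input: 500 simulated daily P&L values with unit volatility.
sample_rng = random.Random(42)
daily_pnl = [sample_rng.gauss(0.0, 1.0) for _ in range(500)]

point_estimate = loss_quantile(daily_pnl)
boot_mean, boot_std = bootstrap_uncertainty(daily_pnl)
print(f"99% loss estimate: {point_estimate:.2f} +/- {boot_std:.2f}")
```

If the spread is a noticeable fraction of the estimate, then quoting the risk number to many decimal places conveys precision that is not there.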


Tools of Quantitative Finance and Risk Management 


The main tools of quantitative finance and risk management are the models for 
valuing financial instruments and the computer systems containing the data 
corresponding to these instruments, along with recipes for generating future 
alternative possible financial environments and the ability to produce reports 
corresponding to the changes of the portfolios of instruments under the different 
environments, including statistical analyses. Risk managers then examine these 
reports, and corrective measures or new strategies are conceived. 


The Greeks 


The common first risk measures are the “Greeks”. These are the various low- 
order derivatives of the security prices with respect to the relevant underlying 
variables. The derivatives are performed either analytically or numerically. The 
Greeks include delta and gamma (first and second derivatives with respect to the 
underlying interest rates or spreads, stock price etc.), vega⁴ (first derivative with 
respect to volatility), etc. The Greeks are accurate enough for small moves of 
underlyings for day-to-day risk management. 
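
The numerical route to the Greeks can be sketched in a few lines of Python. The example below is only illustrative: it assumes a Black-Scholes European call as the pricing function (the text does not prescribe a model), and the parameter values, bump sizes, and function names are hypothetical. It computes delta, gamma, and vega by central differences, then compares the delta-gamma estimate of a price change with a full revaluation for a small and a large move.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price(spot, vol, strike=100.0, rate=0.02, expiry=1.0):
    """Black-Scholes European call price (the pricing model is an assumption)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * expiry) / (vol * math.sqrt(expiry))
    d2 = d1 - vol * math.sqrt(expiry)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * expiry) * norm_cdf(d2)


def numerical_greeks(price_fn, spot, vol, d_spot=0.01, d_vol=1e-4):
    """Central-difference delta, gamma, and vega of price_fn(spot, vol)."""
    up = price_fn(spot + d_spot, vol)
    mid = price_fn(spot, vol)
    down = price_fn(spot - d_spot, vol)
    delta = (up - down) / (2.0 * d_spot)
    gamma = (up - 2.0 * mid + down) / d_spot**2
    vega = (price_fn(spot, vol + d_vol) - price_fn(spot, vol - d_vol)) / (2.0 * d_vol)
    return delta, gamma, vega


def greek_estimate(delta, gamma, move):
    """Second-order (delta-gamma) estimate of the price change for a spot move."""
    return delta * move + 0.5 * gamma * move**2


spot, vol = 100.0, 0.2
delta, gamma, vega = numerical_greeks(call_price, spot, vol)
for move in (1.0, 20.0):
    exact = call_price(spot + move, vol) - call_price(spot, vol)
    print(move, round(greek_estimate(delta, gamma, move), 4), round(exact, 4))
```

Under these assumptions the estimate tracks the full revaluation closely for the small move and drifts away for the large one, which is why the Greeks serve day-to-day risk management and scenario analysis is used for large moves.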


⁴ Vega the Greek? Who thought up this name? Vega does happen to be the 5th brightest 
star, but this is irrelevant. 


Hedges 


Hedges are securities that offset risk of other securities. Knowledge of the hedges 
is critical for trading risk management. Say we have a position or a portfolio with 
value C depending on one or several variables {x_α} (e.g. interest rates, FX rates, 
an equity index, prepayments, gold, ...). Say we want a hedge H depending on 
possibly different variables {y_β}. Normally but not always, the trader will not 
hedge out the whole risk, because to do so he would have to sell exactly what he 
buys (“back-to-back”). Usually, there will be a decision, consistent with the 
limits for the desk, to hedge out only part of the risk. Hedging risk can go wrong 
in a number of ways. Generically, the following considerations need to be taken 
into account: 


1. The hedge variables {y_β} may not be the same as the portfolio variables 
{x_α}, even if reasonably correlated. However, historically reasonable hedges can 
and do break down in periods of market instability where correlations can 
become “stressed” and losses can become high. 


2. Some of the hedge variables may have little to do with the portfolio variables, 
and so introduce a good deal of extra risk on their own. 


3. The hedge may be too costly to implement, not be available, etc. 
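
The first two considerations can be illustrated with a small Python sketch. It assumes the simplest possible setting, which is not taken from the text: one portfolio variable, one hedge instrument, and a minimum-variance hedge ratio set from a historical correlation. All numbers and function names are hypothetical.

```python
import math


def min_variance_hedge(vol_portfolio, vol_hedge, correlation):
    """Hedge ratio and residual volatility of a one-instrument minimum-variance hedge."""
    ratio = correlation * vol_portfolio / vol_hedge
    residual_vol = vol_portfolio * math.sqrt(1.0 - correlation**2)
    return ratio, residual_vol


def hedged_vol(vol_portfolio, vol_hedge, ratio, correlation):
    """Volatility of (portfolio - ratio * hedge) for a given realized correlation."""
    variance = (
        vol_portfolio**2
        + (ratio * vol_hedge) ** 2
        - 2.0 * ratio * correlation * vol_portfolio * vol_hedge
    )
    return math.sqrt(variance)


# Hypothetical volatilities (same units) and correlations.
vol_c, vol_h = 10.0, 8.0
ratio, normal_residual = min_variance_hedge(vol_c, vol_h, correlation=0.9)

# The hedge ratio stays fixed while the correlation becomes "stressed".
stressed_residual = hedged_vol(vol_c, vol_h, ratio, correlation=0.3)
print(round(ratio, 3), round(normal_residual, 2), round(stressed_residual, 2))
```

With these made-up inputs, the hedge removes more than half of the volatility at the historical correlation, but at the stressed correlation the hedged position is more volatile than the unhedged one.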


Scenarios (“What-if”, Historical, Statistical) 


In order to assess the severity of loss due to large moves, scenario or statistical 
analyses are employed. A “what-if” scenario analysis will postulate, during a 
given future time frame, a set of changes of financial variables. A historical 
scenario analysis will take these changes from selected historical periods. A 
statistical analysis will use data, also from selected historical periods, and 
categorize the anomalous large moves (“fat tails”) statistically. Especially 
important, though because of technical difficulties often overlooked, are changes 
in the correlations. We will deal with these issues in the book at length. 
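
A what-if and a historical scenario analysis can be sketched as follows in Python. The sketch assumes a purely linear portfolio (P&L equals exposure times move, summed over variables), which ignores option nonlinearity and changes in correlations. The exposures, scenario names, and moves are made-up numbers for illustration only.

```python
def scenario_pnl(exposures, shifts):
    """P&L of a linear portfolio: sum of exposure * shift over the variables."""
    return sum(exposure * shifts.get(name, 0.0) for name, exposure in exposures.items())


# Hypothetical exposures: P&L per unit move of each variable.
exposures = {"rates_bp": -2_000.0, "equity_pct": 50_000.0, "fx_pct": 10_000.0}

# "What-if" scenario: postulated changes over the chosen time frame.
what_if = {"rates_bp": 50.0, "equity_pct": -10.0}

# Historical scenarios: changes taken from selected past periods (made-up numbers).
historical = {
    "period_A": {"rates_bp": 30.0, "equity_pct": -20.0, "fx_pct": -5.0},
    "period_B": {"rates_bp": -40.0, "equity_pct": -8.0, "fx_pct": 3.0},
}

what_if_pnl = scenario_pnl(exposures, what_if)
historical_pnl = {name: scenario_pnl(exposures, s) for name, s in historical.items()}
worst_period = min(historical_pnl, key=historical_pnl.get)
print(what_if_pnl, historical_pnl, worst_period)
```

A production version would revalue each instrument with its model under the shifted variables rather than rely on linear exposures.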

Usually the scenarios are treated using a simplistic time-dependence, a 
quasi-static assumption. That is, a jump forward in time by a certain period is assumed, 
and at the end of this period, the changes in the variables are postulated to exist. 
The jump forward in time is generally zero (“immediate changes”) or a short time 
period (e.g. 10 days for a standard definition of “Value at Risk”). The analysis 
can be improved by choosing different time periods corresponding to the 
liquidity characteristics of different products (short periods for liquid products 
easy to sell, longer periods for illiquid products hard to sell). The analysis can be 
improved even more by using liquidity estimates for markets in crisis. 
Usually, the risk is determined for a portfolio fixed at a given point in time, 
with no change assumed for the risk measurement period. Scenarios can also 
involve assumptions about the future changes in the portfolios. For example, 
under stressed market conditions and losses, it might be postulated that a given 
business unit would sell a certain fraction of inventory, consistent with business 
objectives. Extra “liquidity” penalties can be assessed for selling into hostile 
markets. These require estimation of the action of other institutions, volumes, etc. 
The worst is an attitude similar to “I don't care what you think your buggy whip 
is worth, I won't pay that much” that leads to the bottom falling out of a market. 


Monte-Carlo Simulation 


A more sophisticated risk approach uses a Monte Carlo simulator, which 
generates possible “worlds” marching forward in time. Either a mathematical 
algorithm can be specified to generate the possible worlds, or different scenarios 
can be chosen with subjective probabilities. Such calculations have more 
assumptions (this is bad) but lead to more detailed information and avoid some of 
the crude approximations of the static analyses (this is good). 

For the most part, complete Monte-Carlo simulations (except for small 
portfolios or large portfolios on a limited basis) are an ongoing risk topic. 
Implementation of a large risk calculation requires a parallel-processing systems 
effort, as described in Ch. 35. 
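
To make the idea of possible worlds marching forward in time concrete, here is a toy Monte Carlo sketch in Python for a single position. The lognormal (geometric Brownian motion) dynamics, the daily time step, the parameter values, and the function names are all assumptions for illustration. A real risk simulator involves many correlated variables and full portfolio revaluation.

```python
import math
import random


def simulate_worlds(spot, drift, vol, horizon_days, n_worlds, seed=0):
    """Simulate final values of one underlying along daily lognormal steps."""
    rng = random.Random(seed)
    dt = 1.0 / 252.0  # one trading day in years (assumed day count)
    finals = []
    for _ in range(n_worlds):
        level = spot
        for _ in range(horizon_days):
            shock = rng.gauss(0.0, 1.0)
            level *= math.exp((drift - 0.5 * vol**2) * dt + vol * math.sqrt(dt) * shock)
        finals.append(level)
    return finals


def loss_at_confidence(final_values, initial_value, confidence=0.99):
    """Loss exceeded in roughly (1 - confidence) of the simulated worlds."""
    losses = sorted(initial_value - v for v in final_values)
    index = min(len(losses) - 1, int(confidence * len(losses)))
    return losses[index]


worlds = simulate_worlds(spot=100.0, drift=0.0, vol=0.2, horizon_days=10, n_worlds=5000)
ten_day_loss = loss_at_confidence(worlds, initial_value=100.0)
print(f"Simulated 10-day 99% loss per 100 of position: {ten_day_loss:.2f}")
```

The extra assumptions are visible in the code: the choice of dynamics, the volatility, and the drift all enter the answer, which is the price paid for the more detailed information.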


Data and Risk 


Knowledge of the historical data changes in different time frames plays a large 
role in assessing risk, especially anomalously large moves as well as the 
magnitudes of moves at different statistical levels. It is also important to know 
the economic or market forces that existed at the time to get a subjective handle 
on the probability that such moves could reoccur. While “the past is no guarantee 
of the future”, the truth is nonetheless that the past is the only past we have, and 
we cannot ignore the past. 


Problematic Topics of Risk: Models, Systems, Data 
A summary of topics treated in more detail in other parts of the book includes: 


1. Models: (Model Risk; Time Scales; Mean Reversion; Jumps and Nonlinear 
Diffusion; Trend Risk and Macro Quasi Random Behavior; Model Limitations; 
Which Model; Psychological Attitudes; Model Quality Assurance; Parameters). 


2. Systems: (What is a System?; Calculators; Traps; Communication; Birth and 
Development; Prototyping; Who’s in Control?; Mergers and Startups; Vendors; 
New Paradigms; Systems Solutions) 


3. Data: (Consistency, Reliability, Completeness, Vendors) 


⁵ Traders sometimes do not believe risk management loss estimates because they think 
they can “trade out of the risk”. This consists of loudly asserting that risky positions can 
be sold before getting hit by any big market move. In practice, that works except when it 
doesn't work, in which case the trader just packs up and moves to another job. 


The Traditional Areas of Risk Management 


Risk management is traditionally separated into Market Risk and Credit Risk. 
There is a growing concern with Operational Risk and Systemic Risk. In this 2nd 
edition we also add long-term risk from Climate Change. 


Market Risk 


Market risk is the risk due to the fluctuation of market variables. Market risk is 
separated out into functional business areas, including Interest Rates, Equities, 
FX, Emerging Markets, Commodities, etc. Further subdivisions include cash 
products (bonds, stocks), derivatives (plain vanilla, exotics), structured products 
(MBS, ABS), etc. Individual desks correspond to further detail (e.g. the desk for 
high-yield bonds). Each area will have its own risk management expertise 
requirements. We will spend a lot of time in this book discussing market risk. 

A corporate-level measure of market risk is called VAR (Value at Risk). We 
will discuss various levels of sophistication of VAR, ending with a quite 
sophisticated measure that in this book is called Enhanced/Stressed VAR⁶. 


Credit Risk 


Credit risk includes several areas, including traditional credit risk assessment of 
corporations, credit issuer risk, and counterparty risk. 

Traditional credit risk assessment by rating agencies involves financial 
statistics (balance sheet, cash flows), comparisons, and in-depth analysis. 

Credit issuer risk is the risk due to defaults and downgrades of issuers of 
securities, notably bonds⁷. We will discuss issuer risk in some detail in this book. 

Counterparty risk is due to nonperformance of counterparties to transactions. 
We will also discuss counterparty risk, including “wrong way risk”⁸. 


⁶ Regulators have adopted a simple version of Stressed VAR, for which they use the same 
term. The original meaning of “Stressed VAR” is given in this book and is considerably 
more sophisticated. The regulatory “Stressed VAR” is close to what the author earlier 
promoted as “Turbulent VAR”. 


⁷ Issuer risk is related to Incremental Risk Charge (IRC) and Specific Risk (SR), for 
which regulators have prescribed definitions. 


Unified Market and Credit Risk Management? 


Market risk and credit risk are correlated. In times of bad markets, credit risk 
increases. Conversely, if credit risk is increasing because of economic weakness, 
the markets will not be bullish. Moreover, there are double counting issues. For 
example, market risk for credit products is determined using spread fluctuation 
data. There are technical market spread components and potential default spread 
components. We will see in Ch. 31 that spread movements can be distinguished 
for particular definitions of market and credit risk. However, it would be better if 
market and credit risk management were integrated. Unfortunately, the languages 
spoken by the two departments are largely disjoint and there can be legacy 
structural issues that hamper communication and integration. 


Operational Risk 


Operational risk deals with the risk of everything else, losses due to the “1001 
Risks”. One presentation tried to get the topics of operational risk on one slide. 
The slide contained such small font that it appeared black. Operational risk can 
be thought of as “Quantifying Murphy’s Law” with large entropy of possibilities 
that can go wrong. Included here are human error, rogue traders, fraud, legal risk, 
organizational risk, system risk etc. Model risk could be regarded as operational 
(it has to go somewhere). The recent accounting and analyst scandals would also 
be classified as operational risk. The worst part about a major loss from 
operational risk is that it is always new and unexpected. 


Systemic Risk 


Systemic risk is concerned with the herding behavior of financial institutions in 
periods of crisis. An example is the crisis of 2008. With interlocked financial 
commitments, collective behavior during a crisis overwhelms the silo risk 
management of individual firms. 


Environmental Climate Change Risk 


Climate change with its recent global warming trend mostly due to human fossil 
fuel consumption is having increasingly negative impacts. Long-term risk 
management of financial and economic systems should and will have to include 
climate risk. 


⁸ Wrong Way Risk deals with the case when the market moves against a company and its 
credit deteriorates at the same time. This depends on correlations between default 
probability and market moves. 


When Will We Ever See Real-Time Color Movies of Risk? 


Soon after starting work on the Street in the mid 80’s, I had a vision of real-time 
risk management, with movies in color of risk changing with the market and with 
the portfolio transactions. I’m still waiting. Drop me an e-mail if you see it. 


Many People Participate in Risk Management 


In a general sense, a vast number of people are involved in risk management. 
These include traders, risk managers (both at the desk level and at the corporate 
level), systems programmers, managers, regulators, etc., in addition to the quants. 
All play important roles. Commonly, risk management is thought of in terms of a 
corporate risk management department, but it is more general. 


Systems Programmers 


Systems play a large role in the ability to carry out successful risk management. 
Systems programmers naturally need expertise in traditional computer science 
areas: code development, databases, etc. It is often overlooked but it is 
advantageous from many points of view if programmers understand what is 
going on from a finance and math point of view. 


Traders 


Traders need to understand intimately the risks of the products they trade. 
Sometimes traders are quants on the desk. Risk reports are designed by analysts 
or traders and coded by the programmers. There are however innumerable 
exceptions—for example, a trader writing her own models on spreadsheets and 
generating local desk risk reports from them. Traders use the models while 
exercising market judgment to gauge risk. 


Risk Managers 


Desk and corporate risk managers need some quantitative ability and must 
possess a great deal of practical experience. Risk managers also have a 
responsibility to understand the risks of business decisions and strategies (e.g. 
customer-based or proprietary trading, new products, etc.). 


Corporate Risk Management 


Corporate risk management aggregates and analyzes portfolio risk, and analyzes 
deals with unusual risk. Corporate risk management also performs limit oversight 
for the business units. The risk results are summarized for upper management in 
presentations. A corporate-level assessment of risk is extremely difficult because 
of the large number of desks and products. Collecting the data can be a 
monumental task. Inconsistent risk definitions between desks and other issues 
complicate the task. 


Two Structures for Corporate Risk Management 


There are two common structures for corporate risk management. The diagram 
illustrates the alternatives of the two-tier or one-tier risk management structure: 


[Diagram: Two-Tier Risk Structure (Corporate Risk Mgmt. above a Desk Risk Manager) 
and One-Tier Risk Structure (Corporate Risk Mgmt. only)] 


In the one-tier risk structure, the corporate risk management department is in 
direct contact with traders on a given desk. In this structure, the corporate risk 
manager follows the day-to-day trading risk details as well as participating in the 
other activities of corporate risk management. In the two-tier risk structure there 
is an intermediate desk or business risk manager. Here, the desk risk manager 
interacts with the traders. The desk risk manager then summarizes or emphasizes 
unusual risk to the corporate risk department.⁹ 


9 Risk Management Structures: The paradigm adopted depends on the risk-management 
philosophies of the trading desks and of corporate risk management. 
Different structures may apply to different desks. The two-tier structure requires a 
division of responsibilities and is now discredited because it makes risk managers 
dependent, which I saw in action. 


Quants in Quantitative Finance and Risk Management 


First, what is a “Quant”? This is a common (though not pejorative) term mostly 
applied to PhDs in science, engineering, or math doing various quantitative jobs 
on Wall Street,¹⁰ also called The Street. 


Jobs for Quants 


We start with jobs involving models. Risk is measured using models. Here the 
standard paradigm is that models are developed by PhD quants writing their own 
code, while systems programmers develop the systems into which models are 
inserted. 

A quant writing a model to handle the risks for a new product needs to 
understand the details of the financial instruments he/she is modeling, the 
theoretical context, the various parameters that become part of the model, the 
numerical code implementing the model etc. The numerical instabilities of the 
models need to be assessed and understood. The model needs to be documented. 

Many other jobs for quants exist. They include risk management, computer 
work, database work, becoming a trader, etc. 

Lately model quality assurance (or “model validation”) has become hot. 


Looking for a Job 


If you are serious about pursuing a career, try to find people in the field and talk 
to them. Networking is generally the best way for finding a job. Headhunters can 
be useful, but be aware that they probably have many resumes just as impeccable 
as yours. At this late date, there are many experienced quants out there. If you get 
to the interview stage, find out as much as possible about what work the group 
actually does. You have to want to do the job, be able to do the job, and be 
willing to give 110%. Enthusiasm counts.¹¹ 


10 “Wall Street”: This means any financial institution, not just the street in New York. 


11 More information on getting a job: Looking for a job is itself a full time job. You 
have to get out there. Meet and talk with as many people as possible. Develop a plan. Ask 
questions. Ask someone for the name of someone else. You should apply even if you 
don’t hit all the buttons on the job description. Be straightforward, friendly, and focused. 
For interviews, it is effective to tell short “stories” of things you did and what results you 
obtained. The hiring manager is looking to see if you can do the job he has to fill, and if 
you have chemistry with the team, all for the right price. 


On the Job: What’s the Product? 


The product on the job depends on the situation. Changing conditions can and do 
lead to changing requirements. Flexibility is important. Don’t be afraid to make a 
suggestion — you may have a good idea. An essential piece of advice is to “Solve 
problems and don’t cause problems”. 


Creativity and the 80-20 Rule 


Creative thinking and prioritized problem-solving abilities are key attributes for a 
quant, along with the skill to apply the "80-20" rule (get 80% of the way there 
with 20% of the effort) in a reasonable time. 


Communication Skills 


Communication is critical. Decisions often must (or at least should) be made with 
input involving quantitative analyses. Specifically, the skills needed to write clear, 
concise memos or to give quick, targeted oral presentations in non-quantitative 
terms, while still getting the ideas and consequences across, are very important and 
should not be overlooked.¹² 


Message to PhD Scientists and Engineers Who Want to Become Quants 


For model building and risk management, you need to know how to program 
fluently in at least one language.¹³ No exceptions. Fluent means that you have 
had years of experience and you do not make trivial mistakes. Prototyping is 
important and extremely useful. Prototyping can be done with spreadsheets or 
with packages like Matlab or Mathematica, etc. However, prototyping is not a 
replacement for serious compiled code. Knowledge of other aspects of computer 
science can also be useful (GUIs, databases and SQL, hardware, networking, 
operating systems, compilers, the Internet, etc). 

For background, in addition to this book (!), read at least one finance text and 
some review articles or talks.¹⁴ Conferences can be useful. Try to learn as much as 
possible about the jargon. Become acquainted to the extent possible with data 
and get a feel for numerical fluctuations. Be able to use the numerical algorithms 


12 Exercise: Please note the practical and amusing but really dead serious exercise in the 
next chapter. Communication skills are a major part of this exercise. 


13 Language Wars: It is amazing how heated discussions on computer languages 
resemble fights over religious dogma. It is easiest to go along with the crowd, whatever 
that means. See Ch. 34. 


14 Jargon: People will assume you know the jargon, which is full of traps. 

for modeling, including Monte Carlo and diffusion equation discretization 
solvers. Learn about analytic models. Learn about risk management. 


Message to Quants Who Want to Become Quant-Group Managers 


If you learn too much about quantitative analysis, finance and systems—and if 
you can manage people—you may wind up as the manager of a Quant Group. 
You now have to work out the mix between managing responsibilities and 
continuing your work as a quant. 

Managing quants can be rather like a description of the Israeli Philharmonic 
Orchestra when it was founded: the orchestra was said to be hard to conduct 
because all the players thought they should be soloists. There are good books 
about managing people", and there are in-house and external courses. My advice 
is to be genuine, work with and alongside your quants, understand the details, 
understand the difficulties, gain the group’s respect, set achievable goals that are 
appreciated by the management, and generally be a leader. 

Depending on the situation, you can have the option of providing innovative 
thinking and leadership while working hard and hands-on. You need to have the 
strength to work independently. You need to continue to give an effort of 110%. 
You need to deliver the product, but be very careful about definitions. Try never 
to use pronouns, and especially not the pronoun "it" "^. 

You will broaden your horizons, meeting smart, friendly experts who can 
teach you a lot, interacting with management, and experiencing the sociology of 
the many tribes in finance, all speaking different languages. Always assume that 
you can learn something. 

Sometimes you will need courage. While most people are up-front and 
helpful, you will encounter a variety of sharks. You may also need to cope at 
various times with adversity, possibly including: misunderstanding, tribalism, 
secrecy, Byzantine power politics, 500-pound gorillas, wars, lack of quantitative 
competence, sluggish bureaucracy, myopia, dogmatism, interference, bizarre 
irrationality, nitpicking, hasty generalizations, arbitrary decisions, pomposity, and 
unimaginative people who like to Play Death to new ideas. 

All in, it is fascinating, challenging, and even fun. Bon voyage. 


5 The Dangerous “It” Word and Too Many Pronouns: The pronoun “it” is probably 
the most dangerous word in the English language, leading to all sorts of 
misunderstandings and friction. More generally, people speak with “too many pronouns”. 
What you refer to by “it” is in all probability not exactly what your interlocutor is 
thinking, and the two concepts may not be on the same planet. You can be burned by “it”. 


3. An Exercise (Tech. Index 1/10) 


This practical and amusing (but dead-serious) exercise will give you a glimpse 
of what it can be like to carry out a few activities in practical finance 
involving a little data, analysis, systems, communication, and management. 
The exercise is illustrative without being technical. There are important 
lessons, the most important being communication. The idea is not just to read the 
exercise and chuckle, but actually to try to do it. 

Remember, the footnotes provide a running commentary and extension of the 
topics in the main text. Footnotes are actually sidebars and form an integral part 
of the book. 


Part #1: Data, Statistics, and Reporting Using a Spreadsheet 


Step 1: Data Collection 


Find the 3-month cash Libor rate and the interest rates corresponding to the 
prices of the first twelve Euro-dollar 3-month futures’. Keep track of each of 
these thirteen rates every day” for two weeks? using the spreadsheet program 
Excel*" and note the rate changes each day. 


' Libor and ED Futures: There are a number of different interest rates used for different 
purposes. You need to spend some time learning about the conventions and the language. 
Libor is probably the most important to understand first. The “cash” or “spot” 3-month 
USD Libor rate is the interest rate for deposits of US dollars in banks in London starting 
now and lasting for 3 months. The related Eurodollar (ED) futures give the market 
"expected" values of USD Libor at certain times called “IMM dates” in the future. ED 
futures are labeled by MAR, JUN, SEP and DEC with different years, e.g. SEP04. The 
interest rate in %/year corresponding to a future is [100 - price of the future]. 


? “The Fundamental Theorem of Data”: Collecting and maintaining reliable data is one 
of the Black Holes in finance (this statement is a Theorem). What you are being asked to 
do here is to get a tiny bit of first-hand experience of how painful this process really is. 
Notice for example that you weren’t told where to find the data. 


? Time: Two weeks is 10 business days. Time in finance is sometimes measured in weird 
units. For example, one year can be 360 days (used for Libor and ED futures). 


> Spreadsheets: If you don't know much about spreadsheets, regardless of what you 
know about programming or quantitative packages, this seemingly trivial exercise is 


Step 2: Statistics and Reporting 


At the end of the 2 weeks, calculate the average and standard deviation of the 
nine changes of the first futures rate, expressed in bp/yr*. Type exactly what you 
did on the top of the spreadsheet with the dates and label everything". Next, use 
the Excel Wizard to draw a graph of these changes, label the graph clearly 
including the units, and print it out. Also, print out the spreadsheet as a report’. 
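For readers who want to check their spreadsheet numbers, here is a minimal Python sketch of the Step 2 statistics. The ten rates are made-up placeholders (not market data) and the function name `daily_changes_bp` is hypothetical. The standard deviation uses the sample (n - 1) convention, which is what Excel's STDEV function uses; check which convention your report assumes.

```python
import statistics

# Hypothetical first-futures rates in %/yr for 10 business days.
# These are placeholders for illustration, not market data.
rates_pct = [1.85, 1.87, 1.84, 1.86, 1.90, 1.88, 1.91, 1.89, 1.93, 1.95]


def daily_changes_bp(rates):
    """Day-over-day rate changes in basis points (1% = 100 bp)."""
    return [round((b - a) * 100.0, 6) for a, b in zip(rates, rates[1:])]


changes = daily_changes_bp(rates_pct)    # 10 rates give 9 changes
average_bp = statistics.mean(changes)
stdev_bp = statistics.stdev(changes)     # sample standard deviation (n - 1)

print(f"changes (bp): {changes}")
print(f"average = {average_bp:.2f} bp, standard deviation = {stdev_bp:.2f} bp")
```

Note that ten daily rates produce only nine changes, which is why the text asks for nine.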


Step 3: Bis for Steps 1,2 


Repeat the above two steps for the 10 daily differences of the first future rate 
minus the cash Libor rate’. 


Step 4: Correlations 


Calculate the correlation of the daily changes of cash Libor with respect to the 
daily changes of the first futures rate". Next, define the "Return" from time t to 
time t + 1 by this daily change of the rate divided by the rate at time t. Calculate 
the correlation of the cash Libor returns with respect to the first future rate 


already a potential problem for you. Spreadsheets are ubiquitous. For quants, 
spreadsheets are useful for prototyping. The alternative can be a restriction in your 
employment possibilities. Excel is the de-facto standard spreadsheet. 


^ Basis Point bp: A basis point is %/100 or 0.0001 in decimal. Time-differences of rates 
(and also spreads, i.e. differences between different rates at the same time) are commonly 
quoted in bp/yr. Usually the /yr unit is left off. 


? Spreadsheet Labeling and Organization: Clear spreadsheet formatting is key to help 
prevent errors and confusion that easily arise, especially in large spreadsheets. This is not 
to mention confusion for yourself if you come back in 6 months and try to understand 
what you did. One handy tip is to use colors with bold type for important quantities (e.g. 
input numbers green, intermediate results yellow, output results blue). Unlabelled 
spreadsheets create misunderstandings. 


* Graphs and Reports: Graphs and reports are ubiquitous in risk management. Reports 
that are clear to people apart from the creator of the report are sadly not always the norm. 


7 Why do Another Calculation? There are many reported quantities. This one measures 
Libor curve risk. This step gets you initiated to repetitive work that can be part of the job. 


* Correlations: Correlations are critical in risk management. We will spend a lot of time 
in the book discussing correlations, including how to stress them consistently. By now 
you should have figured out that a spreadsheet has built-in canned functions to do 
correlation calculations and many others. 

returns. Now look at how different the result is for the correlation using the rate 
returns from the correlation using the rate changes’. 
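The comparison in Step 4 can be sketched in a few lines of Python. Everything here is an illustrative assumption: the two rate series are placeholders, and the helper names `pearson`, `changes`, and `returns` are hypothetical. The point is only that "correlation" depends on whether you feed in differences or returns.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def changes(rates):
    """Daily changes: r(t+1) - r(t)."""
    return [b - a for a, b in zip(rates, rates[1:])]


def returns(rates):
    """Daily returns: (r(t+1) - r(t)) / r(t)."""
    return [(b - a) / a for a, b in zip(rates, rates[1:])]


# Placeholder rates in %/yr for 10 business days (not market data).
libor = [1.80, 1.81, 1.80, 1.82, 1.85, 1.84, 1.86, 1.85, 1.88, 1.90]
fut1 = [1.85, 1.87, 1.84, 1.86, 1.90, 1.88, 1.91, 1.89, 1.93, 1.95]

corr_changes = pearson(changes(libor), changes(fut1))
corr_returns = pearson(returns(libor), returns(fut1))
print(f"correlation of changes: {corr_changes:.4f}")
print(f"correlation of returns: {corr_returns:.4f}")
```

With rates this close together the two numbers will be similar but not identical. The gap widens when the two rate levels differ more.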


Step 5: Written Communication 


Write a two-paragraph memo about what you did, clearly, carefully and neatly 
enough so you could turn it in to your old English professor’. 


Step 6: Verbal Communication 


Staple your nice spreadsheet report and graph to the memo, and hand over the 
package to somebody. Tell her what you did in no more than 3 minutes, and ask 
her to spend no more than 3 minutes looking at the material. Ask her to feed back 
what she understood"'. 


Step 7: Celebrate 
Go have a beer". 


Analysis of Part #1 


You were just walked through a soup-to-nuts exercise. Each step corresponds to a 
common activity. This included written and verbal communication. 


? Definitions: The correlations for differences and the correlations for returns are not 
exactly the same. Many risk measures have different possible definition conventions. You 
may have to dig deep to find out which definitions are being used in a report. 


10 Written Communication, Management, and Goat Language: This part is important. 
You shouldn’t skip it because if you can't write what you did clearly, you may not be paid 
as much. Assuming you do not report to another quant, your potential Wall Street 
manager will not speak your language and is probably neither willing nor capable of 
learning it. The communication of even a summary of technical information or its 
significance is often hard. On the other hand, some managers have excellent intuition, 
understand the thrust of a technical argument quickly, and make valuable suggestions for 
improvement. You will be lucky if you report to such a person. You should make memos 
as clear and simple as possible without sacrificing the message. A wise manager, Gary 
Goldberg, gave some good advice for quants to use simple “Goat Language”. Good luck. 


п Verbal Communication and Management: This part is important and difficult. 
Again, you shouldn’t skip it because if you can't clearly describe orally what you did, you 
won't be paid as much. By the way, your manager may only speed-read your memos. 
Face-to-face communication may be the only way you can transmit your message. You 
will probably not be given much time for the meeting. Hit the important points. Be 
prepared for a possibly arcane experience. You have to try to learn how to adjust. 


12 Beer: This is not pointless, and gives some idea of the sociology. Still, this activity is 
not as common as some people might imagine. People work hard and go home. 

Communication skills are essential, yet many quants communicate badly, to their 
detriment. 

The worst error consists of using the pronoun IT ". 

Hopefully you used Microsoft Word or WordPerfect (not the vi editor in 
UNIX) to write your short report so that it looks like a professional document that 
a manager will take seriously. Appearance counts". 

For you PhDs who feel insulted by the trivial technical aspects of this 
exercise, be aware that most quant work on the Street is not academic. You may 
well have sideline activities like those described above, though the work can 
be more difficult, fast moving, and challenging than you might imagine. 


Part #2: Repeat Part #1 Using Programming 


Instead of the spreadsheet, write a program in your favorite computer language 
along with file inputs to perform the same steps as in Exercise Part #1. Document 
your source code'^ by clearly writing at the top what it is you are doing, in good 
English with complete sentences. If you skipped the memo and verbal 
communication, it's time to bite the bullet '’. Print out your report and graph". 
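As one possible shape for Part #2, here is a short Python sketch with a file input and a plain-English description at the top. The CSV layout (columns `date` and `price`), the function names, and the sample values are all assumptions made for illustration. The price-to-rate conversion uses the relation rate = 100 - price quoted earlier for ED futures.

```python
"""Read daily Eurodollar futures prices from a CSV file, convert each price
to an interest rate in %/yr using rate = 100 - price, and report the average
and sample standard deviation of the daily rate changes in basis points."""
import csv
import io
import statistics


def load_rates(handle):
    """Read rows with 'date' and 'price' columns; return (date, rate) pairs."""
    reader = csv.DictReader(handle)
    return [(row["date"], 100.0 - float(row["price"])) for row in reader]


def change_stats_bp(rates):
    """Average and sample standard deviation of daily rate changes, in bp."""
    values = [rate for _, rate in rates]
    changes = [(b - a) * 100.0 for a, b in zip(values, values[1:])]
    return statistics.mean(changes), statistics.stdev(changes)


# Stand-in for a real input file; dates and prices are placeholders.
SAMPLE_CSV = """date,price
2002-06-03,98.15
2002-06-04,98.13
2002-06-05,98.16
2002-06-06,98.14
"""

rates = load_rates(io.StringIO(SAMPLE_CSV))
mean_bp, stdev_bp = change_stats_bp(rates)
print(f"{len(rates)} days read; average change = {mean_bp:.2f} bp, "
      f"standard deviation = {stdev_bp:.2f} bp")
```

To run against a real file, replace the `io.StringIO` stand-in with `open(path, newline="")`. The graph is deliberately left out of the sketch.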


Analysis of Exercise, Part #2 


You have just carried out the same exercise in "production mode", as opposed to 
the spreadsheet "prototype mode". 


? Second Warning: The Most Dangerous Word in the English Language is “It”: 
Again, the probability is 100% that every person will have a different definition of the 
word “it” for any given reference. Besides the confusion generated, you can get severely 
burned if the management thinks you are saying something or promising something other 
than what you are intending. 


14 Appearance of Documents and Presentations: There are people who have greatly 
improved their careers producing easy-to-read documents and PowerPoint color 
presentations. Upper management is usually NOT interested in the details and IS 
interested in getting summary information quickly and painlessly. You can learn from 
these people. 


16 Source Code Comments: Good programming practice, remember? I once had to try to 
make sense out of some complex mortgage code that had no comments at all. The remark 
of the programmer was that the code was completely obvious. *&*(“%$#. 


" Written and Verbal Communication: No, you can't skip this activity. 


Graphs and Programmers: Can you (ahem!) produce graphs using your compiled 
code? It is hard to describe the frustration with systems groups that as a matter of 
"principle" only work with compiled code and hate spreadsheets, but have trouble 
producing reasonable reports and graphs. 


Part #3: A Few Quick and Tricky Hypothetical Questions 


Question 1: System Risk 


What would you estimate to be the amount of data such that programming would 
be preferred over spreadsheets? '° Under which situations would you recommend 
replacing the files with a database? Next, suppose you are ordered to take over 
either the spreadsheet or the source code from somebody who has left the 
company and has documented nothing". Now which approach would you favor? 


Question 2: Should we do this Deal or not? 


Say you have to estimate the risk for a Backflip Libor option lasting for 1 year. 
What time period of historical data would you recommend in order to get a 
handle on the potential risk of this animal, and when will you have the answer? 


Question 3: Market Risk 


Based on the ridiculously small 10-day sample, if your boss came to you right 
now, what would you say would be a reasonable measure for Libor risk?!” 2°. 


^ Programs vs. Spreadsheets: Portfolios can have hundreds of correlated variables and 
thousands of deals. On the other hand, you may need to provide an answer by 2 p.m. for a 
risk analysis depending on a few variables for a deal perhaps about to go live. 


17 Personnel Risk: The situation described is not academic — it happens all the time. By 
the way, did you document your code? 


18 The Backflip Option and the Time Crunch: You have never heard of the Backflip 
Option; the name is fictitious. In practice, you may not have time to analyze a 
complicated option in detail or even get the precise definition much in advance. You need 
to get used to the pace — you're not going to publish a journal article. In fact, the desk 
wants an answer by 2 pm. So based on the information, what are you going to do? 


? Management Communication Will Not Go Away: Your boss knows nothing about 
statistics other than having a hazy impression of the basic concepts. Do you think that 
he/she understood what you said? It is critical for your career that the management 
understands what you are doing and why it is important. The challenge for you is that 
their eyes may glaze over after two minutes of explanation. The particular issue of Libor 
risk in the example will not come up because industrial-strength databases and quants 
have solved it, but the communication issue is always there. 


? Simple Procedures, Accuracy, and Communication: The use of simple procedures is 
a double-edged sword. An important potential advantage is better communication to non- 
technical management. The downside is that management may neither understand nor 
remember the limitations of a simple procedure. Try to get a handle on the uncertainty. If 
the approximation is reasonable, communicate up front that you are using a simple but 
reasonable approximation, possibly with the aim of improving it as priorities permit. 


Messages and Advice 


To Computer Programmers Who Skipped Exercise Part #1 


Now is the time to go back and do Exercise Part #1 in Excel that you may have 
skipped, no matter how impure you find it and no matter how much you hate 
Microsoft. You will be regarded as more useful and valued by a business unit if 
you ALSO know how to use a spreadsheet for quick calculations, reports, and 
prototyping. Please try writing the memo and describing your work verbally — 
these skills will really be useful for you. You will definitely be more effective 
by learning something about finance, even if you are a programmer. And stop 
talking about martingales. 


To Those Who Can’t Program and Skipped Exercise Part #2 


So maybe you are on your way to sales or investment banking. Still, this is a 
good time to learn at least the rudiments of a compiled computer language if you 
don't already know one. Even if you never have to program in your future career 
on the Street, the chances are high that you will be interacting in some fashion 
with computer people. The more you know about systems and the way the 
technology guys think, the better you will be able to communicate with them, 
understand what the problems are, and get done what you want. Otherwise, 
computer land can turn out to be a frustrating black-box experience. 


4. Equity Options (Tech. Index 3/10) 


In this chapter, we begin the analysis of standard or “plain-vanilla” equity 
options.¹ Similar remarks will hold for other options, e.g. foreign exchange (FX) 
options. Just for balance, we treat some topics in the FX options chapter that are 
directly applicable to equity options and vice-versa. We will treat the subject 
qualitatively, reserving the formalism for later chapters (see especially Ch. 42). 


Pricing and Hedging One Option 


In order to present the material in the way you might encounter it on a system, a 
spreadsheet format is shown. Following each section are some quick comments.² 
We begin with the deal definition, which specifies the "kinematics". This 
information is put into the official database of the firm's “books and records”.³ 


Deal Definition 

 1  Deal ID                            ABC123 
 2  Option type                        Call 
 3  Strike                       $E    $100 
 4  Calculation date                   6/25/02 
 5  Settlement date                    6/27/02 
 6  Expiration date                    6/25/04 
 7  Payment date                       6/27/04 
 8  Principal                          $1 MM 
 9  Divide option by spot?⁴            No 
10  Divide option by strike?           No 


1 Acknowledgements: I thank many traders, especially Larry Rubin and Alan Nathan, for 
helpful conversations on practical aspects of equity options. 


2 Comments for System: If you ever had to deal with a derivatives pricing system, you 
know that the definitions of the various quantities on the screen are sometimes not totally 
clear. If comments are put into help screens, time is saved and some errors can be 
prevented for the uninitiated user. 


3 Other information is related to the counterparty ABC (legal entity etc.) 


4 Details like #9, #10 may be in the deal specification and the model will have to cope. 


Comments: Deal Definition 

 1  For database ID purposes 
 2  Call Option: Option holder has right to buy something 
 3  Option is at the money if strike = stock price 
 4  Date the calculation is done (e.g. today) 
 5  Payment for the option is made at settlement date 
 6  A European option has one expiration date 
 7  Any option payout, if it exists, is paid at this date 
 8  Normalization factor 
 9  If Yes, delta will be changed 
10  Another convention 


The parameter inputs are shown next. These describe the “dynamics” of the 
deal related to the option model. The types of parameters form part of the model 
specification. The numerical values of the parameters are not part of the model.⁵ 


Parameters 

1  Spot Price            $Spot   $100 
2  Interest Rate         r       5% 
3  Discounting Spread    s       0 bp 
4  Dividend Yield        y       1% 
5  Implied Volatility    σ       15% 
6  Type of Volatility            Lognormal 
7  Skew Correction       dσ      0% 


Illustrative Comments: Parameters 

1  “Spot” means current (today's) price 
2  Need to specify type of rate, units 
3  Some models have two interest rates differing by a spread 
4  Alternatively to dividend yield, can specify discrete dividends 
5  Implied Volatility produces the option price using model formula 
6  “Lognormal” = model assumption of Gaussian behavior for log price changes 
7  Volatility Skew: Adjustment of volatility designed to fit certain option data 


The results of the model are shown next. First is the option price. 


Price 

1  Option value   $C   $12.225 


5 Definition of a Model: The inclusion of the types of parameters as a part of the model 
is not an idle formality. The numerical algorithm only specifies part of the story. 


Comments: Price 

1  Discounted expected value of payout cash flow 
   (Either analytic or numerical solution to some accuracy) 
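To get a feel for the numbers, here is a rough Python sketch that prices the deal above. It assumes the standard Black-Scholes formula for a European call with a continuous dividend yield, continuously compounded rates, and exactly 2 years to expiration; the text does not confirm any of these conventions, and the function names are hypothetical. Under these assumptions the price comes out near $12.3, close to but not equal to the $12.225 in the table, which is a small lesson in how much conventions matter.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, div_yield, vol, t_years):
    """European call price and delta under lognormal assumptions.

    Assumes continuously compounded rate and dividend yield (a convention
    chosen for this sketch, not taken from the text).
    """
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * t_years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    disc_div = math.exp(-div_yield * t_years)
    disc_rate = math.exp(-rate * t_years)
    price = spot * disc_div * norm_cdf(d1) - strike * disc_rate * norm_cdf(d2)
    delta = disc_div * norm_cdf(d1)
    return price, delta


# Inputs from the Parameters table; T = 2 years is an approximation of the
# 6/25/02 to 6/25/04 period.
price, delta = bs_call(spot=100.0, strike=100.0, rate=0.05,
                       div_yield=0.01, vol=0.15, t_years=2.0)
print(f"price = {price:.3f}, delta = {delta:.4f}")
```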


Next, the hedging details of the option are presented. Notice that there is a 
variety of conventions for the hedging quantities. For example, “gamma” can be 
defined in practice in many different ways, both analytically and numerically. 


Hedging 

1  Delta               Δ     0.6698 
2  Δ Fixed Dividend          0.6828 
3  $Delta              $Δ    $66.983 
4  Strike Delta              -0.5476 
5  Gamma               γ     0.01641 
6  $Vega                     $0.496 
7  $Rho                      $1.073 
8  $Phi                      -$1.294 
9  $Theta 


Comments: Hedging 

1  Standard Δ = $dC/$dSpot, may be numerical derivative 
2  Delta with fixed $Dividends (y = $Dividends/$Spot) 
3  $Spot * Delta, the third convention used 
4  Strike Delta $dC/$dE is approximately = -Δ 
5  Gamma γ = $d²C/$dSpot²; many possible conventions 
6  Change $dC for dσ = 1% (sometimes 1bp is used) 
7  Change $dC for dr = 1% (sometimes 1 bp is used) 
8  Change $dC for dy = 1% (sometimes 1bp is used) 
9  Change $dC for dt = 1 day (see footnote) 
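Since the comments mention that delta "may be numerical derivative" and that gamma has many conventions, here is one possible convention in Python: central differences with a fixed dollar bump of the spot. The pricing function is an assumed Black-Scholes call with continuous dividend yield and T = 2 years, and the names `call_price`, `numerical_delta`, and `numerical_gamma` are hypothetical. The bump size is itself a convention, and a system may use something different.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price(spot, strike=100.0, rate=0.05, div_yield=0.01,
               vol=0.15, t_years=2.0):
    """Assumed Black-Scholes call price with continuous dividend yield."""
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * t_years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return (spot * math.exp(-div_yield * t_years) * norm_cdf(d1)
            - strike * math.exp(-rate * t_years) * norm_cdf(d2))


def numerical_delta(price_fn, spot, bump=0.01):
    """Central-difference first derivative of price with respect to spot."""
    return (price_fn(spot + bump) - price_fn(spot - bump)) / (2.0 * bump)


def numerical_gamma(price_fn, spot, bump=0.01):
    """Central-difference second derivative of price with respect to spot."""
    return (price_fn(spot + bump) - 2.0 * price_fn(spot)
            + price_fn(spot - bump)) / bump ** 2


delta = numerical_delta(call_price, 100.0)
gamma = numerical_gamma(call_price, 100.0)
print(f"delta = {delta:.4f}, gamma = {gamma:.5f}")
```

The results land close to the table values of 0.6698 and 0.01641 without matching them exactly, for the same convention reasons that affect the price.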


Did you notice theta didn’t show up? You found a bug! Great! See the footnote.⁶ 
The hedging of one equity option takes place using Taylor’s theorem. We get 


6 Good Work, You Found the Bug: Now that you've discovered the bug, you get to 
have the experience of dealing with it. Maybe theta has been calculated and just doesn't 
appear (GUI bug), in which case you will be talking with systems personnel. They are 
hassled, and will probably put your request at the end of a long priority list. You may 
now have to go to meetings to monitor the status of the system. Maybe there is a 
calculation bug — something wasn't defined etc. In that case, perhaps you will get to 
debug the code to calculate theta. This means you will decipher a lot of code written by 
other people, some long gone, with less than sterling documentation. Before any work the 
programming budget has to be there, and so fixing bugs like this impacts the business 
planning cycle. Aren't you glad you found the bug? 


\[ dC = \Delta \cdot dS + \tfrac{1}{2}\gamma \cdot (dS)^2 + \text{Vega} \cdot d\sigma + \text{Rho} \cdot dr + \text{Phi} \cdot dy + \text{Theta} \cdot dt + \dots \qquad (4.1) \] 


Higher order “cross” terms can be important especially for options near 
expiration and/or large moves in the underlying. However, the traditional 
characterization of an option and risk reporting is done with hedging parameters. 
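As a quick illustration of how Eq. (4.1) is used, the sketch below estimates the change in option value for a hypothetical market move, using the hedging parameters from the table above. The function name and the scenario (spot up $2, volatility up 1%) are invented for illustration. Theta is set to zero only because it is missing from the table, and cross terms are ignored.

```python
def taylor_value_change(delta, gamma, vega, rho, phi, theta,
                        d_spot, d_vol_pct=0.0, d_rate_pct=0.0,
                        d_yield_pct=0.0, d_days=0.0):
    """Approximate dC from Eq. (4.1), without higher-order cross terms.

    vega, rho, phi are $ changes per 1% move in volatility, rate, and
    dividend yield; theta is the $ change per day.
    """
    return (delta * d_spot
            + 0.5 * gamma * d_spot ** 2
            + vega * d_vol_pct
            + rho * d_rate_pct
            + phi * d_yield_pct
            + theta * d_days)


# Hedging parameters from the table; theta = 0.0 is a placeholder because
# the table shows no value for it.
d_c = taylor_value_change(delta=0.6698, gamma=0.01641, vega=0.496,
                          rho=1.073, phi=-1.294, theta=0.0,
                          d_spot=2.0, d_vol_pct=1.0)
print(f"estimated change in option value: ${d_c:.5f}")
```

For this scenario the delta term contributes 1.3396, the gamma term 0.03282, and the vega term 0.496.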


American Options 


For risk management, we need to know the risk as a function of future time. For 
European options, the expiration date is unique. For American options,⁷ which 
can be exercised at any time, there is no definite expiration time.⁸ American 
options must be valued with numerical codes, discretizing the diffusion equation 
(corresponding to an assumed random movement of the stock price, along with a 
drift term specified by “no-arbitrage” constraints). The hedges (Δ, γ, ...) for an 
American option need to be determined numerically. The question is how to 
distribute the risk as a function of future time. This can be done with the 
probabilities of exercise at different times. Consider the illustration below. 


[Figure: American option probabilities of exercise = [0%, 10%, 30%, 20%, 10%] for 
the time bins shown. Labels: stock price paths; number of paths crossing the 
Atlantic or critical path for each bin.] 


7 Other Types of Exercise: A Bermuda option can be exercised only on certain dates. 
Sometimes there is a lockout period during which exercise is not allowed. These types of 
options are common with interest rates. See Ch. 43. 


8 More recently, an “American Monte Carlo” procedure with interpolation has been used, 
which we discuss later in the book. 

The probabilities of exercise at different future times⁹ can be found by a path 
counting technique.¹⁰ The "critical path" (or as I used to call it, the “Atlantic 
path"), corresponds to the stock price as a function of time at which American 
option exercise occurs. Once the critical path is determined, the fraction of paths 
crossing the critical path in each interval in time can be found, for example by 
Monte Carlo simulation. This fraction defines \( \mathcal{P}_{\text{Exercise}}(t)\,dt \), namely the 
probability of exercise at time \( t \) in \( dt \). The above picture has several bins in time 
for ten paths with a stylized (constant) critical path. 
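The path counting idea can be sketched with a small Monte Carlo simulation. This is only a stylized illustration under stated assumptions: lognormal paths, a constant critical level (as in the picture), exercise at the first upward crossing as for a call, and a made-up critical level of 130. The function name is hypothetical, and a real critical path would come from the back-chaining procedure rather than being assumed.

```python
import math
import random


def exercise_probabilities(spot, critical_level, drift, vol, horizon_years,
                           n_bins, n_paths, steps_per_bin=25, seed=0):
    """Fraction of simulated paths first crossing a constant critical level
    in each time bin. Paths that never cross are not counted in any bin."""
    rng = random.Random(seed)
    n_steps = n_bins * steps_per_bin
    dt = horizon_years / n_steps
    counts = [0] * n_bins
    for _ in range(n_paths):
        s = spot
        for step in range(n_steps):
            z = rng.gauss(0.0, 1.0)
            # One lognormal step of the stock price.
            s *= math.exp((drift - 0.5 * vol ** 2) * dt
                          + vol * math.sqrt(dt) * z)
            if s >= critical_level:
                counts[step // steps_per_bin] += 1   # first crossing only
                break
    return [c / n_paths for c in counts]


# drift = r - y = 5% - 1% and vol = 15% follow the Parameters table;
# the critical level of 130 is a hypothetical placeholder.
probs = exercise_probabilities(spot=100.0, critical_level=130.0, drift=0.04,
                               vol=0.15, horizon_years=2.0, n_bins=5,
                               n_paths=2000)
print("probability of exercise per bin:", probs)
print("never exercised before the horizon:", 1.0 - sum(probs))
```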

The determination of the actual critical path is done using a back-chaining 
procedure from last exercise time along with a dictum of maximizing profit—if at 
a given time, option exercise is more profitable than the expected return holding 
the option, the option is exercised. 

Given the probabilities of exercise, the hedges can be distributed in time as $\Delta(t) = \Delta \cdot \mathcal{P}_{\text{Exercise}}(t)$ for $t < T$ (maturity). The remainder of $\Delta$ is put into $\Delta(T)$. This procedure, while accurate, is cumbersome. Physically, the risk is mostly at the last exercise time if the option is far out of the money (OTM), and at the first possible exercise time if the option is far in the money (ITM)¹¹.
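As a small worked illustration of this bookkeeping, the sketch below spreads a total delta over time bins using the exercise probabilities quoted in the picture, and puts the remainder at maturity. The function name `distribute_hedge` and the total delta of 1000 are hypothetical choices for the example.

```python
def distribute_hedge(total_delta, exercise_probs):
    """Spread a hedge over early-exercise bins; the remainder goes to maturity."""
    early = [total_delta * p for p in exercise_probs]
    at_maturity = total_delta - sum(early)
    return early, at_maturity

# Probabilities of exercise for the bins in the illustration
early, at_maturity = distribute_hedge(1000.0, [0.0, 0.10, 0.30, 0.20, 0.10])
print("delta per early bin:", early)
print("delta placed at maturity:", at_maturity)
```

With these probabilities, 70% of the delta sits in the early bins and the remaining 30% is assigned to maturity.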


Basket Options and Index Options 


We have used the word “stock” as if it were a single stock. However, the most common stock options correspond to stock index options, e.g. options on the S&P 500 index. The index $B$ is a linear combination, or “basket”, of stocks $\{S_\alpha\}$ with certain weights $\{w_\alpha\}$, viz. $B(t) = \sum_\alpha w_\alpha S_\alpha(t)$. Hence, an index stock option is actually an option on a basket.

The most common theoretical assumption made for stock price movements is “lognormal”, namely relative changes or returns $R_\alpha(t) = d_t S_\alpha(t)/S_\alpha(t)$ of a stock price are distributed normally. In a small (formally infinitesimal) time interval $dt$ starting at time $t$ and ending at time $t+dt$, we write $d_t S_\alpha / S_\alpha = \mu_\alpha\,dt + \sigma_\alpha\,dz_\alpha$. The symbol $d_t$ indicates time difference. Here $dz_\alpha$

9 Notation: These probabilities of exercise for Bermuda/American options are generalizations of the European option probability of exercise measured by the function commonly called $d_-$ or $d_2$. See Ch. 42.

10 Path Integrals and Options: Path integral formalism with applications: Ch. 41-45.

11 OTM, ITM, ATM Notation: OTM = Out-of-the-money, ITM = In-the-money, ATM = At-the-money. These abbreviations are used throughout the book.
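is a Gaussian random variable with width $\sqrt{dt}$. The drift $\mu_\alpha$, “volatility” $\sigma_\alpha$, and $dz_\alpha$ are all functions of time¹². A little arithmetic shows that $d_t B(t)/B(t) \neq \sum_\alpha w_\alpha R_\alpha(t)$, i.e. the return of the index is not the weighted sum of the returns of the components. For that reason, the return of the basket is not lognormal if the components are lognormal. Instead, we find that

$$\frac{d_t B(t)}{B(t)} = \frac{\sum_\alpha w_\alpha S_\alpha(t)\left[\mu_\alpha\,dt + \sigma_\alpha\,dz_\alpha(t)\right]}{\sum_\beta w_\beta S_\beta(t)} \tag{4.2}$$

We suppose $\{dz_\alpha\}$ are correlated according to $\langle dz_\alpha\,dz_\beta\rangle_c = \rho_{\alpha\beta}\,dt$, and suppose that the basket and stock values do not vary too much from their values at some specific time $t_0$. Denoting $\langle XY\rangle_c = \langle XY\rangle - \langle X\rangle\langle Y\rangle$ we obtain

$$\sigma_B^2\,dt \equiv \left\langle\left[\frac{d_t B(t)}{B(t)}\right]^2\right\rangle_c = \frac{\sum_{\alpha\beta} w_\alpha w_\beta\,S_\alpha(t_0)\,S_\beta(t_0)\,\sigma_\alpha\sigma_\beta\,\rho_{\alpha\beta}\,dt}{\left[B(t_0)\right]^2} \tag{4.3}$$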


With these assumptions, the lognormal assumption with the basket volatility $\sigma_B$ is then used for baskets. The basic reason is that, even though somewhat inconsistent, anything else would be infeasible. Sometimes, again rather inconsistently, a time-dependent basket volatility using the above formula is employed, replacing the time $t_0$ with an arbitrary time $t$.

In practice, the basket volatility may use the above formula for small custom basket deals. For index options (e.g. the S&P 500) the index volatility $\sigma_B$ is just parameterized and no attempt is made to build it up from the components. This is because the detailed information simply does not exist.
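For a small custom basket, the basket volatility formula above is a double sum over components. The sketch below evaluates it directly; the function name `basket_volatility` and the three-stock inputs are hypothetical illustration values, and the stock values are assumed frozen at a single reference time as in the text.

```python
import math

def basket_volatility(weights, spots, vols, corr):
    """Basket vol from component vols and correlations, with stock values
    frozen at one reference time (the approximation described in the text)."""
    basket = sum(w * s for w, s in zip(weights, spots))
    variance = 0.0
    n = len(weights)
    for a in range(n):
        for b in range(n):
            variance += (weights[a] * spots[a] * vols[a]
                         * weights[b] * spots[b] * vols[b] * corr[a][b])
    return math.sqrt(variance) / basket

# Hypothetical three-stock basket
weights = [2.0, 1.0, 3.0]
spots = [50.0, 120.0, 30.0]
vols = [0.25, 0.30, 0.40]
corr = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.4],
        [0.3, 0.4, 1.0]]
sigma_basket = basket_volatility(weights, spots, vols, corr)
print("basket volatility:", round(sigma_basket, 4))
```

Because the correlations are below one, the basket volatility comes out below the value-weighted average of the component volatilities.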


Basket Options in Disguise: Swaptions 


Although we have been discussing equity basket options, an exactly similar set of 
steps is used to derive the volatility for an interest-rate swap break-even rate. 


12 Skew: In addition, the volatility can depend on the stock price; this is called “skew”. We will discuss skew at some length in Ch. 5 and 6.

13 Correlation: The average is statistical. Mathematically an infinite set of Gaussian random numbers is supposed to be used for the average, but of course, this never happens in practice. We discuss correlations thoroughly in Ch. 22-25, 37.

14 Living With Inconsistency: This is one of the reasons why it is not too profitable to try to axiomatize option theory with a lot of unnecessary mathematical rigor.

There the stock prices are replaced by “forward rates”. The swap is a weighted “basket” of forward rates minus a swap rate, and the “break-even” swap rate is a weighted “basket” of forward rates. The nominal model for forward-rate fluctuations is again lognormal, and the break-even rate is (again somewhat inconsistently) assumed lognormal. We will examine swaptions in Ch. 11.


Other Types of Equity Options; Exotics 


There are a large number of types of equity options, many exotic. “Exotic” 
basically means something that cannot be priced using a Black-Scholes formula 
with simple parameters. We will deal with a representative set of exotics, but it 
would require a heavy volume to categorize the full panoply of equity options. 


Portfolio Risk (Introduction) 


Once we have the future time distribution of hedges for each deal, we can add up the risks for all deals in a portfolio. Then we know in principle what the portfolio risk looks like as a function of future time $T$ as seen from the valuation date today $t_0$. Now the future risk for each deal changes as the calendar time $t_0$ moves forward. Moreover, a portfolio of deals changes as a function of time as new deals are done, old deals are exercised or expire, hedges are modified, etc. For this reason, the risk must be monitored regularly.

The periodicity of the risk reporting depends on the user. For the trading 
desks, the risk reports are run daily (usually overnight). For corporate reporting 
purposes, the time scales for risk reporting are much coarser, for example 
quarterly. The risks of particular deals may be monitored by the corporate risk 
group on an as-needed basis. 


Scenario Analysis (Introduction) 


It should be clear that the use of the Greeks will not work well if large moves in the stock price occur. For example, a short¹⁵ put option with a large notional that is far OTM¹¹ might contribute negligibly to the risk. However, if the market suddenly tanks¹⁶, this option can suddenly become ATM with large risk. Conversely, an apparent large risk may not mean much. For example, a short put option ATM near expiration with large negative gamma may have little real downside risk¹⁷.

15 Long, Short: “Long” means the option was bought; “short” means the option was sold.

16 Jargon: The phrase “the market tanks” is a technical term that means the market is quickly going to pot.

For this reason, scenario analyses assuming various types of large moves are 
analyzed from time to time. These scenarios can be one of three types: 


Scenario Type 1. “Ad-hoc” scenarios (e.g. suppose the stock price drops 59%)
Scenario Type 2. Historical scenarios (e.g. the worst case stock price drop = 37%)
Scenario Type 3. Statistical scenarios (e.g. 99.9% CL stock price drop = 42%)¹⁸


We will look at scenario risk analysis for portfolios in some detail in Ch. 12 
and also in Ch. 39. 
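To make the statistical scenario type concrete, here is one possible reading of the procedure: fit an effective Gaussian width to the tail of the historical distribution at a 99% CL, then extrapolate to the desired confidence level. This is only a sketch. The function name `statistical_scenario`, the synthetic return history, and the simple empirical-quantile rule are all assumptions made for the example.

```python
import random
from statistics import NormalDist

def statistical_scenario(returns, fit_cl=0.99, target_cl=0.999):
    """Gaussian tail fit at fit_cl, extrapolated to target_cl.
    Returns (loss at fit_cl, extrapolated loss at target_cl)."""
    ordered = sorted(returns)
    index = int((1.0 - fit_cl) * len(ordered))
    tail_loss = -ordered[index]                      # empirical loss at fit_cl
    unit_normal = NormalDist()
    sigma_eff = tail_loss / unit_normal.inv_cdf(fit_cl)   # effective tail width
    return tail_loss, sigma_eff * unit_normal.inv_cdf(target_cl)

# Synthetic "historical" returns, purely for illustration
rng = random.Random(42)
history = [rng.gauss(0.0, 0.02) for _ in range(2500)]
tail_loss, scenario = statistical_scenario(history)
print("99% CL loss:", round(tail_loss, 4))
print("extrapolated 99.9% CL loss:", round(scenario, 4))
print("worst historical loss:", round(-min(history), 4))
```

Because the 99.9% CL number is an extrapolation, it can exceed the worst move actually present in the history, which is how a statistical scenario can be worse than the worst-case historical scenario.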
17 Homework: Do you believe this statement?

18 Statistical Scenarios: The reader may wonder how a statistical scenario based on historical information can be worse than the worst-case historical scenario. This is because the statistical scenario typically will be defined using a Gaussian (bell-curve) fit to the tails of the distribution (e.g. at a 99% CL) and then extrapolated to the desired confidence level. We discuss such subjects in Ch. 21.


5. FX Options (Tech. Index 4/10) 


In this chapter we discuss foreign-exchange (FX) derivatives. We start with FX forwards and simple FX options. We give some practical details for FX options, including hedging with Greeks. We introduce volatility skew (or smile). We give some examples of pricing exotic barrier FX options. We present the “two-country paradox”. We discuss quanto options and correlations, FX options in the presence of stochastic interest rates, and comment on numerical codes and sanity checks.


FX Forwards and Options 


Consider the picture for the idea of “interest-rate parity” for FX forwards. 


[Figure: Interest-rate parity diagram (the diagram commutes). One leg discounts back from $t^*$ using $r_f$; the other leg discounts back from $t^*$ using $r_d$.]




Say we are in the US with US dollars (USD). We can convert the USD to XYZ currency using the spot FX rate at time $t_0$, put the XYZ amount in the XYZ bank at foreign rate $r_f$ between $t_0$ and $t^*$, and finally convert back at $t^*$ from XYZ to USD using the forward FX rate at $t_0$ defined below. The result should be the same as if we kept the USD in the USD bank at domestic rate $r_d$ during the same time period between $t_0$ and $t^*$. Equivalently, we can take the proceeds at $t^*$ in each currency and discount back to $t_0$ at the respective rates to compare the results at $t_0$. The diagram commutes; we can go around it either way. The resulting formula for the forwards is given below.

We use the following notation relative to the USD.

$$\eta = \frac{\#\,\text{Units (XYZ)}}{\text{One USD}}, \qquad \xi = \frac{\#\,\text{USD}}{\text{One Unit (XYZ)}} \tag{5.1}$$


I used to live in France, and even though the Euro now exists I still think in French francs and recall definitions using $\eta = 5$ FF/USD, $\xi = 0.2$ USD/FF¹.

Market quotes can exist using either $\eta$ or $\xi$. There are old traditions, e.g. $\xi$ for GBP and for the currencies of the British Empire. Sometimes $\eta$ is called a “European quote” and $\xi$ an “American quote”. My definition for $\xi$ is the same as in Andersen's bookⁱ and is like any price, e.g. $\xi = \$0.50$/apple. Other conventions exist: “GBP-USD = 1.50” or “GBP/USD = 1.50” instead of $\xi = 1.50$ USD/GBP here. Cross FX rates UVW/XYZ are treated in exactly the same fashion. You need to know the conventions.


Pricing FX Forwards and FX European Options 


Call the spot FX rate $\xi_0$, and call the domestic and foreign interest rates² $r_d = r_d(\tau)$, $r_f = r_f(\tau)$ for a time interval $\tau$. The forward FX rate $\xi_{fwd}$ derived from the interest-rate parity argument given above is

$$\xi_{fwd} = \xi_0\exp\left[(r_d - r_f)\,\tau\right] \tag{5.2}$$

1 Old Habits Die Hard: Prices in French stores are quoted in FF side by side with Euros. Moreover prices for land as recently as 1980 were quoted not in FF but in “anciens francs”, each worth 1/100 FF.

2 Notation: Interest rates are exhibited as continuously compounded.


Note that the continuously compounded foreign interest rate acts like a “dividend yield” in the equivalent formula for equity options. The “FX swap” is the forward minus spot difference $\xi_{swap} = \xi_{fwd} - \xi_0$, multiplied by some conventional normalizing power of 10.
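The interest-rate parity forward is a one-line calculation. In the sketch below the helper `fx_forward` is a hypothetical name; its convention is that the spot is quoted as units of the "numerator" currency per one unit of the "denominator" currency, so the same function serves for both $\xi$ and $\eta$ quotes. The inputs are the DEM/USD parameters quoted in the double barrier example later in this chapter (spot 1.5303 DM/USD, DEM rate 3.2366%, USD rate 5.4913%, 92 days on a 365-day basis).

```python
import math

def fx_forward(spot, r_numerator_ccy, r_denominator_ccy, tau):
    """Forward FX rate from interest-rate parity, continuously compounded rates.
    spot is quoted as (numerator currency) per one unit of (denominator currency)."""
    return spot * math.exp((r_numerator_ccy - r_denominator_ccy) * tau)

tau = 92.0 / 365.0
eta_fwd = fx_forward(1.5303, 0.032366, 0.054913, tau)        # DM per USD
xi_fwd = fx_forward(1.0 / 1.5303, 0.054913, 0.032366, tau)   # USD per DM
print("forward in DM/USD:", round(eta_fwd, 5))
print("forward - spot:", round(eta_fwd - 1.5303, 5))
print("consistency check, eta_fwd * xi_fwd:", round(eta_fwd * xi_fwd, 12))
```

The DM/USD forward agrees with the 1.52163 DM/USD shown in that example, and the two quoting conventions give forwards that are reciprocals of each other.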
Models for FX options are similar to those used for equity options. One difficulty is conceptual; it is important to label things and keep track of the units. Assuming lognormal behavior of $\xi(t)$ and using the forward $\xi_{fwd}$, the usual formalism goes through, as first derived by Garman and Kohlhagen. The option prices and deltas today $t_0$, up to an overall normalization, are³ (upper signs for the call, lower signs for the put)

$$C_{call,put}(\xi_0, t_0) = \exp[-r_d\tau]\left[\pm\,\xi_{fwd}\,N(\pm d_+) \mp E_\xi\,N(\pm d_-)\right] \tag{5.3}$$

$$\Delta_{call,put}(\xi_0, t_0) = \pm\exp[-r_f\tau]\,N(\pm d_+) \tag{5.4}$$

Here the standard cumulative normal distribution is $N(x)$, where

$$N(x) = \int_{-\infty}^{x}\frac{du}{\sqrt{2\pi}}\,\exp\left(-u^2/2\right) \tag{5.5}$$

We have $N(x) + N(-x) = 1$.

Below is the discretized cumulative normal distribution or “confidence level CL” graph, plotted vs. the number of standard deviations “stdevs” $x = -3 \ldots +3$. Thus $x = 2$ stdevs corresponds to the 97.7% CL. Also 2.33 stdevs corresponds to the 99% CL, and 1.65 stdevs corresponds to the 95% CL.

The partial integrals between stdevs, labeled “Gaussian”, are listed below the corresponding upper integral endpoints (thus, 34.1% printed in the box below 0 is the integral from -1 to 0). We have $N(0) = 0.5$ and $N(1) - N(-1) \approx 68.3\%$ (the probability within ±1 stdev).

Memorize this graph. 
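For readers who would rather regenerate the numbers than memorize them, the cumulative normal can be written with the error function from the standard library. The helper name `cumulative_normal` is just a label for this sketch.

```python
import math

def cumulative_normal(x):
    """Standard cumulative normal distribution N(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print("stdevs  cumulative  partial integral ending at x")
for x in range(-3, 4):
    # The entry under -3 is the whole lower tail; otherwise integrate from x-1 to x
    lower = cumulative_normal(x - 1) if x > -3 else 0.0
    partial = cumulative_normal(x) - lower
    print(f"{x:+d}      {100 * cumulative_normal(x):5.1f}%      {100 * partial:5.1f}%")

print("N(2.33) =", round(cumulative_normal(2.33), 4))
print("N(1.65) =", round(cumulative_normal(1.65), 4))
```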


3 Options Formalism: We will get a lot of math later; here we just quote the formulae.

[Figure: Normal Distribution vs. # Stdevs. Vertical axis from 0% to 100%; see values in table.]

# Stdevs      -3      -2      -1       0       1       2       3
Cumulative    0.1%    2.3%   15.9%   50.0%   84.1%   97.7%   99.9%
Gaussian      0.1%    2.1%   13.6%   34.1%   34.1%   13.6%    2.1%


In Eqn. (5.4) $\Delta_{call,put} = \partial C_{call,put}/\partial\xi_0$ is the textbook definition, although other definitions exist. Also,

$$d_\pm = \frac{1}{\sigma\sqrt{\tau}}\left[\ln\left(\xi_{fwd}/E_\xi\right) \pm \sigma^2\tau/2\right] \tag{5.6}$$

These formulae are for a call or put on the same currency XYZ, or mathematically just the Black-Scholes formula for a call or put on the variable $\xi$. They need further normalization, discussed below. Also the volatility $\sigma$ is the one appropriate for the expiration at time $t^* = t_0 + \tau$.
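A minimal sketch of these formulae in code follows, before any notional normalization. The function name `garman_kohlhagen` and the GBP-style inputs are hypothetical; the only library call is `statistics.NormalDist().cdf` for $N(x)$.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard cumulative normal N(x)

def garman_kohlhagen(spot, strike, r_dom, r_for, vol, tau, is_call=True):
    """Unnormalized FX option price and textbook delta (dC/dspot).
    spot and strike are in domestic currency per one unit of foreign currency."""
    forward = spot * math.exp((r_dom - r_for) * tau)
    width = vol * math.sqrt(tau)
    d_plus = (math.log(forward / strike) + 0.5 * width ** 2) / width
    d_minus = d_plus - width
    sign = 1.0 if is_call else -1.0
    price = math.exp(-r_dom * tau) * sign * (
        forward * N(sign * d_plus) - strike * N(sign * d_minus))
    delta = sign * math.exp(-r_for * tau) * N(sign * d_plus)
    return price, delta

# Hypothetical GBP example: xi in USD per GBP
call, call_delta = garman_kohlhagen(1.50, 1.55, 0.05, 0.06, 0.10, 0.5, True)
put, put_delta = garman_kohlhagen(1.50, 1.55, 0.05, 0.06, 0.10, 0.5, False)
print("call price (USD per GBP):", round(call, 5), " delta:", round(call_delta, 4))
print("put price  (USD per GBP):", round(put, 5), " delta:", round(put_delta, 4))
```

A useful sanity check on any such code is put-call parity: the call minus the put should equal the discounted difference between the forward and the strike. As a further check, the same function applied to the DEM/USD parameters of the double barrier example later in this chapter should land close to the standard call value quoted there.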

Now a call option on USD physically is the same as a put option on XYZ. This is because if USD appreciates with respect to XYZ then XYZ depreciates with respect to USD. It is important to note that for an option on USD, we are in XYZ-land. Then the natural option variable is $\eta$ and the “domestic” and “foreign” rates are reversed⁴. Written as a function of $\eta$, we have

$$\eta_{fwd} = \eta_0\exp\left[(r_f - r_d)\,\tau\right] \tag{5.7}$$

$$C_{call,put}(\eta_0, t_0) = \exp[-r_f\tau]\left[\pm\,\eta_{fwd}\,N(\pm d_{+\eta}) \mp E_\eta\,N(\pm d_{-\eta})\right] \tag{5.8}$$

$$\Delta_{call,put}(\eta_0, t_0) = \pm\exp[-r_d\tau]\,N(\pm d_{+\eta}) \tag{5.9}$$

$$d_{\pm\eta} = \frac{1}{\sigma\sqrt{\tau}}\left[\ln\left(\eta_{fwd}/E_\eta\right) \pm \sigma^2\tau/2\right] \tag{5.10}$$

4 Lognormal Behavior for $\xi$, $\eta$ and a Preview of the Two-Country Paradox: Because $\eta = 1/\xi$, lognormal behavior of $\xi$ implies lognormal behavior of $\eta$. However making this change of variable modifies the drifts. This leads to a conundrum, discussed below, the “two-country paradox”. The paradox does not show up for ordinary options.


Some Practical Details for FX Options 


There are several ways to quote the results for options and a number of different 
conventions for reporting the Greeks. There are also modifications of the option 
formula to take account of the specific features of the FX options market. 


Normalization of FX Options 


The overall normalization factor for the option formula is important. We need to 
express the option price in units of a definite currency and get rid of the FX 
currency ratio. The same option can be reported in four ways: 

Method 1. Call on XYZ using the variable $\xi$. Normalization to get USD units: Divide by spot $\xi_0$, multiply by USD notional $\mathcal{N}_{USD}$.

Method 2. Call on XYZ using the variable $\xi$. Normalization to get USD units: Either multiply by XYZ notional $\mathcal{N}_{XYZ}$, or divide by strike $E_\xi$ and multiply by USD notional $\mathcal{N}_{USD} = E_\xi\,\mathcal{N}_{XYZ}$.

Method 3. Put on USD using the variable $\eta$. Normalization to get XYZ units: Divide by spot $\eta_0$, multiply by XYZ notional $\mathcal{N}_{XYZ}$.

Method 4. Put on USD using the variable $\eta$. Normalization to get XYZ units: Either multiply by USD notional $\mathcal{N}_{USD}$, or divide by strike $E_\eta$ and multiply by XYZ notional $\mathcal{N}_{XYZ} = E_\eta\,\mathcal{N}_{USD}$.


While all this may seem trivial (not to mention annoying), the output from 
some model will lead to confusion until you know the convention. The existence 
of the different conventions is because the deal may be booked either in the US 
or in XYZ-land. 
For example, suppose XYZ = MXN (Mexican peso). If X lives in Mexico and buys a USD put, X pays in MXN. From X’s point of view, a put on something (e.g. $\mathcal{N}_{USD}$ dollars) is the right to sell that thing for a fixed number $\mathcal{N}_{MXN} = E_\eta\,\mathcal{N}_{USD}$ of pesos, so X naturally would use the definition #4.

However, say the broker-dealer BD (who sells the USD put to X) is in the US. If the option pays off, BD will have to give $\mathcal{N}_{MXN}$ pesos to X in exchange for $\mathcal{N}_{USD}$ dollars. Then BD must purchase $\mathcal{N}_{MXN}$ pesos at expiration $t^*$ with a number $E_\eta\,\mathcal{N}_{USD}/\eta(t^*)$ of dollars, where $\eta(t^*)$ is the exchange rate at $t^*$. Hence from BD's point of view, the put is normalized at time $t^*$ by $1/\eta(t^*)$, or today $t_0$ by $1/\eta_0$ (with spot e.g. $\eta_0 = 10$ MXN/USD). Hence the BD uses definition #3.
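The statement that a call on XYZ and a put on USD are the same option can be checked numerically once the normalizations are applied. The sketch below prices the option once in the $\xi$ variable with the Method 2 normalization (result in USD) and once in the $\eta$ variable with the Method 4 normalization (result in XYZ), then converts the XYZ amount to USD at spot. The pricing helper `forward_price`, the MXN-style market inputs, and the notional are hypothetical illustration values.

```python
import math
from statistics import NormalDist

def forward_price(forward, strike, vol, tau, r_discount, is_call):
    """Lognormal option value on a forward, discounted at r_discount."""
    n = NormalDist().cdf
    width = vol * math.sqrt(tau)
    d_plus = (math.log(forward / strike) + 0.5 * width ** 2) / width
    d_minus = d_plus - width
    sign = 1.0 if is_call else -1.0
    return math.exp(-r_discount * tau) * sign * (
        forward * n(sign * d_plus) - strike * n(sign * d_minus))

# Hypothetical market: xi in USD per MXN, eta = 1/xi in MXN per USD
xi_0, strike_xi = 0.10, 0.095
r_usd, r_mxn, vol, tau = 0.05, 0.20, 0.15, 0.5
notional_usd = 1_000_000.0
notional_mxn = notional_usd / strike_xi          # N_USD = E_xi * N_XYZ

# Method 2: call on MXN in the xi variable, times MXN notional, gives USD
xi_fwd = xi_0 * math.exp((r_usd - r_mxn) * tau)
method2_usd = forward_price(xi_fwd, strike_xi, vol, tau, r_usd, True) * notional_mxn

# Method 4: put on USD in the eta variable, times USD notional, gives MXN
eta_0, strike_eta = 1.0 / xi_0, 1.0 / strike_xi
eta_fwd = eta_0 * math.exp((r_mxn - r_usd) * tau)
method4_mxn = forward_price(eta_fwd, strike_eta, vol, tau, r_mxn, False) * notional_usd

print("Method 2 value in USD:", round(method2_usd, 2))
print("Method 4 value in MXN:", round(method4_mxn, 2))
print("Method 4 converted to USD at spot:", round(method4_mxn * xi_0, 2))
```

The two USD numbers agree, which is the kind of sanity check worth running whenever the booking convention of a model's output is unclear.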


Hedging FX Options with Greeks: Details and Ambiguities 


The Greeks are used for option risk management. It is important to understand 
that the Greeks do not have unique definitions. Although this information is in 
this section on FX options, the same remarks in this section apply to equity 
options. For FX, there are ambiguities depending on the normalization, as above. 


Delta 


Because the spot changes day-by-day (indeed minute by minute), if the option is divided by the spot there is a change in the formula for the delta hedging of the option. Besides the variation of normalization factors $1/\eta_0$, a modified delta can be defined by multiplying delta by $\eta_0$. The bottom line is that there are several possible definitions of delta. When trying to understand FX option risk reports (usually just labeled cryptically with words like “delta”) you obviously need to know the normalization conventions⁵.

For plain vanilla options, ordinary differentiation can be used. More complex 
options require numerical code, possibly including “skewed” volatilities, and 
possibly with boundaries (e.g. barriers) or complicated boundary conditions (e.g. 
American options). Then numerical difference procedures must be used to get 
delta and gamma. Sometimes symmetric differences are used, and sometimes 
only one-directional moves are used. A one-directional move can become 
problematic for a barrier option when spot gets near the barrier. 


5 On Getting the Conventions: This may not be so easy to do. Here is a hypothetical 
scenario. The guy who wrote the code for this risk report left the company last month. He 
was so busy producing other customized reports for the new head trader on the desk that 
he had no time to document anything (which he regarded as a waste of time anyway). Of 
course since I just made up that scenario, you will never see anything like it. 
The amount of move in spot used for the numerical difference is another hidden variable. A move up (e.g. 1%) might cross a barrier, leading to a very different result than with a smaller move not crossing the barrier.


Gamma 


There are at least nine different ways to define gamma, some with very different numerical values. Gamma may or may not be defined to include varying the $1/\eta_0$ normalization factor, and may contain other factors of $\eta_0$.


Vega 

Vega can be defined by a continuous derivative, by numerical differences with 
volatility changes of 1%, by numerical differences with volatility changes of 10 
bp (sometimes followed by scaling the result up by 10), and by differences using 
a percentage of the volatility. Differences in procedure can be noticeable when 
the volatility is low and/or the option is away from ATM. 
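The size of these differences is easy to see numerically. The sketch below computes vega four ways for a plain lognormal call on a forward: a continuous derivative scaled to a 1% vol move, a 1% bump, a 10 bp bump scaled up by 10, and a bump of 1% of the vol level. The function names and the two test cases (an ATM option with ordinary vol, and an OTM option with low vol) are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist

_norm = NormalDist()

def call_price(forward, strike, vol, tau, discount):
    """Lognormal call on a forward, times a discount factor."""
    width = vol * math.sqrt(tau)
    d_plus = (math.log(forward / strike) + 0.5 * width ** 2) / width
    return discount * (forward * _norm.cdf(d_plus) - strike * _norm.cdf(d_plus - width))

def vega_definitions(forward, strike, vol, tau, discount):
    """Four common ways a risk report might define 'vega'."""
    width = vol * math.sqrt(tau)
    d_plus = (math.log(forward / strike) + 0.5 * width ** 2) / width
    base = call_price(forward, strike, vol, tau, discount)
    return {
        "continuous, per 1% vol": discount * forward * _norm.pdf(d_plus) * math.sqrt(tau) * 0.01,
        "bump vol by 1%": call_price(forward, strike, vol + 0.01, tau, discount) - base,
        "bump 10 bp, scaled x10": 10.0 * (call_price(forward, strike, vol + 0.001, tau, discount) - base),
        "bump 1% of vol level": call_price(forward, strike, vol * 1.01, tau, discount) - base,
    }

atm = vega_definitions(forward=1.50, strike=1.50, vol=0.20, tau=0.25, discount=0.99)
otm_low_vol = vega_definitions(forward=1.50, strike=1.60, vol=0.04, tau=0.25, discount=0.99)
for label in atm:
    print(f"{label:26s} ATM: {atm[label]:.6f}   OTM, low vol: {otm_low_vol[label]:.6f}")
```

For the ATM case the first three definitions nearly coincide. For the OTM option with low vol, the 1% bump is several times the continuous-derivative number, because vega itself changes rapidly with vol there.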


Rho and Phi 
Rho (sensitivity to $r_d$) and phi (sensitivity to $r_f$) can be defined by continuous derivatives, or by numerical differences with interest-rate changes of varying amounts (1% or less, sometimes followed by scaling the result back up to an equivalent 1% move). Differences in procedure can be noticeable when rates are high, which happens for emerging markets.


Theta 


Theta is essentially a poor man’s simulator. There are many ways to define theta. 
Possibilities include moving the calculation date one day forward or one business 
day forward, moving the settlement day one day forward (or not moving the 
settlement date), leaving the spot rate fixed or moving the spot rate forward, 
using the same volatilities and interest rates or re-interpolating the volatilities and 
interest rates corresponding to the new dates. Sometimes a continuous derivative 
is used and the result has to be rescaled to get equivalent one-day differences. 


Higher Greeks (Speed, Charm, Color) 

Because the usual Greeks just represent the lowest order terms in the Taylor series of the option, some higher-order terms can be monitored. These includeⁱⁱ Speed, Charm, and Color. Speed is the derivative of gamma with respect to the underlying. Charm is the derivative of delta with respect to time. Color is the derivative of gamma with respect to time⁶.


Other Option Details 


Other conventions include different time periods for the diffusion, for carry, and 
for discounting. Here is a picture of how FX options actually work. 


[Figure: Time intervals for actual FX option pricing, showing the diffusion interval and the discounting interval.]

The payment for the option premium (at settlement) is due some time after the deal is done or calculated today. Any payment from the option payoff takes place some time after expiration. These times are usually 2 business days. The diffusion takes place from today to the option expiration. The carry (i.e. the term from the interest rates $r_d, r_f$ in the drift) takes place from settlement to the option payoff date. The discounting is from the option payoff date back to today. The extra discounting from settlement back to today is called “Tom Next”. Other details include the usual plethora of conventions for interest rates and volatilities.


FX Volatility Skew and/or Smile 


The skew dependence of volatility is generally thought of as a monotonic dependence on the option strike, viz. $\sigma(E)$. The strike dependence is needed in order to reproduce European option values using the standard Black-Scholes formula. A more complicated non-monotonic dependence is often seen in FX, called a “smile”⁷. Here is a freehand picture illustrating the idea:

[Figure: Volatility smile: $\sigma(E)$ increases at both ends.]

6 Envy? Maybe these names (charm, color) just prove that some finance guys really wanted to be high-energy physicists.


Physical Motivation for the FX Volatility Smile 


The smile behavior for FX can be qualitatively understood using a “fear” idea. For illustration, say spot is $\xi_0 = 1.50$ USD/GBP or $\eta_0 \approx 0.67$ GBP/USD.

Consider an OTM GBP put for a low strike $E_\xi^{Low} = 1.40$ USD/GBP that pays off at expiration if $\xi < E_\xi^{Low}$. The fear that the GBP might depreciate this much can induce more investors to buy these puts. The extra demand raises the put price. Thus we get a premium on this volatility $\sigma(E_\xi^{Low})$ relative to the ATM vol, resulting in volatility skew.

Now, consider an OTM call on the GBP for a high strike (say $E_\xi^{High} = 1.60$ USD/GBP) that pays off if $\xi > E_\xi^{High}$. This option, on the other hand, is an OTM put on USD with a low strike in the inverse variable, $E_\eta^{Low} = 1/E_\xi^{High} \approx 0.625$ GBP/USD, that pays off if $\eta < E_\eta^{Low}$. The fear that the USD might depreciate this much can also place a premium on this volatility $\sigma(E_\eta^{Low})$. Since these two options are the same thing, $\sigma(E_\xi^{High}) = \sigma(E_\eta^{Low})$.

Thus, vol premia for both low and high strikes can exist; this is the smile. Because the fear intensities are generally not equal, the currencies generally being of different strengths, the smile will not generally be symmetric.

7 More About Skew: A lot more information about skew for equity options is in the next chapter.


General Skew, Smile Behavior for FX 


A more general phenomenological parameterization of the complexity in the strike dependence of the volatility includes both monotonic skew and non-monotonic smile terms, and the mix is time-dependent. A study along these lines was reported for dollar-yenⁱⁱⁱ.


Fixed Delta Strangles, Risk Reversals, and Hedging 


A strangle is the sum of a call and put, and a risk reversal is the difference of a call and put. The FX option vol skew/smile is often quoted using “25-delta” strangles and risk reversals⁸. Since 50-delta would be ATM and 0 delta would be completely OTM, the 25-delta is “halfway” OTM. A call $C_{call}(\sigma_{call}, E_{call})$ and put $C_{put}(\sigma_{put}, E_{put})$ are used with $\Delta_{call} = -\Delta_{put} = 0.25$ for a given expiration time $t^*$. We need to find the strikes to produce this 25-delta condition, remembering that $\sigma_{call}(E_{call})$, $\sigma_{put}(E_{put})$ are functions of the strike. Other delta values are also used, e.g. 10-delta.

Let us examine the fixed delta conditions a little more. The strikes of the call and the put for a given value of delta are related through the forward.

When the deltas are equal and opposite as for the 25-delta condition, we have

$$d_{+,call}(\sigma_{call}, E_{call}) = -\,d_{+,put}(\sigma_{put}, E_{put}) \tag{5.11}$$


To get an idea of what this implies, consider the situation without skew where this relation simplifies. Up to a volatility term, the forward is related to the product of the call strike and the put strike. From equations (5.4), (5.6) we get

$$E_{put}\,E_{call} = \xi_{fwd}^2\,\exp\left[\sigma^2\tau\right] \tag{5.12}$$

This means that if we own a call on GBP with 25-delta, ignoring skew, we can hedge it locally with a put on GBP having the same delta (and opposite sign) provided we choose the put strike $E_{put}$ according to the above relation.

8 Risk Reversal Convention: The price of a risk reversal is the difference of the call and put prices on the USD. The market is also quoted in terms of the difference in the implied volatilities of the USD call and put. Again, remember that a USD call is an XYZ put so a positive risk reversal means a bearish market on the XYZ currency. Also, a risk reversal has other names: combo, collar, cylinder, ...
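Without skew, the fixed-delta strikes can be written in closed form by inverting the cumulative normal in the delta formula. The sketch below does this for a 25-delta call and put and checks the product relation between the two strikes. The function name `fixed_delta_strikes` and the GBP-style inputs are hypothetical; with skew, the vol would depend on the strike and a numerical root search would be needed instead.

```python
import math
from statistics import NormalDist

def fixed_delta_strikes(forward, vol, tau, r_for, delta=0.25):
    """Strikes (call, put) with deltas +delta and -delta, assuming a single
    flat volatility (no skew) and the textbook delta definition."""
    width = vol * math.sqrt(tau)
    # Call delta: exp(-r_for * tau) * N(d_plus_call) = delta
    d_plus_call = NormalDist().inv_cdf(delta * math.exp(r_for * tau))
    # d_plus = [ln(F/E) + width^2/2] / width, solved for the strike E
    strike_call = forward * math.exp(0.5 * width ** 2 - d_plus_call * width)
    # Equal and opposite deltas mean d_plus_put = -d_plus_call
    strike_put = forward * math.exp(0.5 * width ** 2 + d_plus_call * width)
    return strike_call, strike_put

xi_0, r_d, r_f, vol, tau = 1.50, 0.05, 0.06, 0.10, 0.5
xi_fwd = xi_0 * math.exp((r_d - r_f) * tau)
strike_call, strike_put = fixed_delta_strikes(xi_fwd, vol, tau, r_f)
print("forward:", round(xi_fwd, 4))
print("25-delta call strike:", round(strike_call, 4))
print("25-delta put strike:", round(strike_put, 4))
print("E_put * E_call:", round(strike_put * strike_call, 6))
print("forward^2 * exp(vol^2 * tau):", round(xi_fwd ** 2 * math.exp(vol ** 2 * tau), 6))
```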


Implied Probabilities of FX Rates and Option “Predictions” 


Formally, the second derivative of the standard option formula with respect to the strike is the Green function, including the discount factor⁹. The implied Green function or probability distribution function (pdf) $G_{Implied\;pdf}$ is defined by inserting the market prices for options $C_{Market\;Data}(E)$ expiring at some time $t^*$ with different strikes. The second derivative is approximated numerically. With $x$ either $\ln\xi$ or $\ln\eta$, we get the relation

$$G_{Implied\;pdf}\left[x^* = \ln(E),\,t^*\right] = E\,\frac{\partial^2 C_{Market\;Data}(E)}{\partial E^2} \tag{5.13}$$

Theoretically $G_{Implied\;pdf} \ge 0$. Then $\partial^2 C[E;\sigma(E)]/\partial E^2 \ge 0$, leading to a “butterfly inequality” involving derivatives of $\sigma(E)$. Define with Gatheralⁱᵛ $y = \ln(E/\xi_{fwd})$, $w = \sigma^2\tau$, $w' = \partial w/\partial y$, $w'' = \partial^2 w/\partial y^2$, and

$$g(y, w) = \left(1 - \frac{y\,w'}{2w}\right)^2 - \frac{w'^2}{4}\left(\frac{1}{w} + \frac{1}{4}\right) + \frac{w''}{2}$$

Straightforward algebra yields, with $N'(x) = \exp(-x^2/2)/\sqrt{2\pi}$,

$$\frac{\partial^2 C[E;\sigma(E)]}{\partial E^2} = \frac{e^{-r_d\tau}\,N'(d_-)\,g(y, w)}{E\sqrt{w}} \ge 0 \tag{5.14}$$

Looking at the height of the implied pdf at some level, the option market's “prediction” today about the probability that the FX rate will find itself at that level at time $t^*$ in the future can be numerically ascertained.
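The numerical second derivative in the strike is simple to sketch. Below, a flat-vol lognormal call formula stands in for the market prices (an assumption made purely so that the answer can be checked), and the strike times the central second difference of the prices is compared with the known discounted lognormal density in the log variable. The names `model_call` and `implied_green_function` and all inputs are hypothetical.

```python
import math
from statistics import NormalDist

_norm = NormalDist()
FORWARD, VOL, TAU, R_DOM = 1.50, 0.10, 0.5, 0.05   # illustrative inputs

def model_call(strike):
    """Stand-in for market call prices: flat-vol lognormal formula."""
    width = VOL * math.sqrt(TAU)
    d_plus = (math.log(FORWARD / strike) + 0.5 * width ** 2) / width
    return math.exp(-R_DOM * TAU) * (
        FORWARD * _norm.cdf(d_plus) - strike * _norm.cdf(d_plus - width))

def implied_green_function(price_fn, strike, rel_step=1e-3):
    """Strike times the central second difference of prices in the strike."""
    h = rel_step * strike
    second = (price_fn(strike + h) - 2.0 * price_fn(strike) + price_fn(strike - h)) / h ** 2
    return strike * second

def lognormal_green_function(strike):
    """Discounted lognormal density in x = ln(strike), for comparison."""
    width = VOL * math.sqrt(TAU)
    d_minus = (math.log(FORWARD / strike) - 0.5 * width ** 2) / width
    return math.exp(-R_DOM * TAU) * _norm.pdf(d_minus) / width

for strike in (1.40, 1.50, 1.60):
    numeric = implied_green_function(model_call, strike)
    exact = lognormal_green_function(strike)
    print(f"E = {strike:.2f}  numerical: {numeric:.5f}  lognormal: {exact:.5f}")
```

With real market data the prices would first have to be interpolated in the strike, and the choice of step size then matters much more than it does for this smooth stand-in.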

As mentioned, the physical basis of this “prediction” is just that the fear factor leading purchasers to buy XYZ puts at elevated prices or vols produces an elevated implied pdf value for future XYZ currency depreciation. If these fears are realized, the prediction will come “true”. While the statement that “I am afraid XYZ will drop and will pay extra for put insurance, therefore my prediction is that XYZ will drop” may seem like a tautology, the implied pdf does produce a quantitative evaluation of the effect.

Here is a picture of the idea for a lognormal pdf and a modified implied pdf including skew effects¹⁰. The fat tails coming from the increased volatility for low values of the underlying increase the implied pdf values at low values, with respect to the unskewed lognormal pdf.

9 Green Functions: We will look at the math later; for right now just try to get the idea.


[Figure: Implied pdf and lognormal pdf, plotted against the underlying.]


If there is substantial real market pressure on the XYZ currency due to selling XYZ spot accompanied by feedback of bearish purchases of XYZ puts, the option market “prediction” will indeed come “true”. However the amount of such depreciation really relies on complex factors including transaction volumes, macroeconomic information, and local political conditions instead of theoretical peaks in the implied option probability distribution. To say the least, the impact of these real factors on the FX market is not easy to evaluate successfully and consistently over time¹¹.

Still, option pricing including skew information is desirable since information from the option market is included, including the fear-factor distortion.

10 Plot: Actually this plot comes from the S&P 500 index data in Ch. 4, and it serves to illustrate the idea generally. The implied pdf illustrated is just a lognormal pdf with the skewed vol put in at each value of the underlying and the total renormalized to 1.


Pricing Barrier Options with Skew 


Barrier options are options that change their character if the underlying variable hits the barrier value. Barrier options are discussed in detail later in the book. Skew effects can be important for barrier option pricing. Consider an up and out call option¹² on XYZ currency as a function of $\xi$ where the barrier is $K_\xi$. For simplicity ignore smile and just assume skew. As $\xi$ increases along a path going near the barrier, the XYZ currency is appreciating and the local volatility decreases. This lowers the probability of knockout compared to the classic situation with a constant volatility independent of the $\xi$ level.

The drawing below gives the idea:

[Figure: Equivalent problem: redefined barrier, no skew. The vol skew is replaced by a redefined barrier at $K_\xi + \delta K_\xi$ with similar Prob(KO). The presence of skew (smaller $\sigma$ near $K_\xi$) makes the barrier seem farther away.]

11 Trading: Hey, that's why good traders make the big bucks, right?

12 Up-Out Option: This means that if the underlying goes above the barrier at any time before option expiration, the call option ceases to exist.
From the point of view of a drunk gremlin staggering along such a path defined by a given set of appropriate random numbers¹³, the barrier at $K_\xi$ is “hard to reach” in his skewed world. Equivalently to an observer, the gremlin gets less drunk as it approaches the barrier and staggers less, making it less likely to cross. We can replace this gremlin by a new gremlin in an unskewed world if we also replace the barrier $K_\xi$ with a new barrier $K_\xi^{UnskewedWorld} = K_\xi + \delta K_\xi$. This is done such that the unskewed-world gremlin on a path with the same random numbers (but with the higher unskewed volatility) has the same probability of knockout with respect to this new barrier $K_\xi^{UnskewedWorld}$ as does the skewed-world gremlin on his path with respect to the original barrier $K_\xi$.

Although we cannot find one new barrier to force equivalence on all paths, we can do it on the average given the average barrier crossing or KO time $\tau_{KO}$ (cf. Ch. 17, p. 251). The approximate change in the barrier is¹⁴

$$\delta K_\xi \approx \xi_0\left[\sigma(\xi_0) - \sigma(K_\xi)\right]\sqrt{\tau_{KO}} \tag{5.15}$$

Qualitative “back-of-the-envelope” ideas of this sort can be used to develop intuition, or possibly for quick estimations on the desk. Another use is to provide sanity checks for model code that is being developed, for debugging.


Double Barrier Option: Practical Example 


Double barrier options are more complicated than single barrier options. Double barrier options have two barriers, one above and one below the starting FX rate. The simplest case is a double knock-out option, where the option ceases to exist if either barrier is crossed at any time to expiration.

The formalism for double-barrier options is in Ch. 18. Here, we just give a practical example. Double barrier options are often used in FX. To give an idea, here is a 2-barrier USD call example, complete with hedging information, presented in a spreadsheet-like fashion that you might see in a system¹⁵ ¹⁶.

Quantity                                Value
Double KO call on USD                   79,653 DM
Standard call (no barriers)             249,846 DM
Forward FX rate                         1.52163 DM/USD
Probability of knock out                44.3 %

Delta = Spot * dC/dSpot                 486,369 DM
Gamma = 1% * Spot * d²C/dSpot²          -561,443 DM
Vega (Vol up 1%)                        -15,529 DM
Theta (1 day)                           797 DM

Rho (DM rate up 100 bp)                 2,979 DM
Phi (100 bp)                            -3,343 DM

13 Diffusion and Drunks: The classic model for diffusion is a drunk staggering out of the bar. The drunk performs a random walk about the forward path. Mean reversion can be modeled by attaching one end of a spring to one leg and the other end to a rod along the forward path. A jump can be modeled by having a big gust of wind knock the drunk one way or the other. Long-term drifts can be modeled by sloping the terrain appropriately.

14 History: I developed this heuristic idea of moving the barrier to account approximately for skew in 1995.


The probability that knockout occurs from either boundary is around 44 %. 
This probability decreases as the barriers are moved away. For example, if the 
lower barrier were 1.0, the probability of knock out drops by a factor of two. If 
one of the barriers is removed, the double-barrier option becomes a single barrier 
option. For example, the double-barrier KO option with the lower barrier dropped 
substantially turns into an up-out single barrier option. 

The fact that $\gamma < 0$ means that, as spot $\eta_0$ moves up, the presence of the
upper barrier lowers the option price and lowers delta. Eventually we will get
$\Delta < 0$ and the option price will go to zero as $\eta_0$ approaches the upper barrier.


The negative vega is also related to the barriers, because as the volatility 
increases, the probability of knock-out also increases. If the vol goes up 1%, the 
probability of knockout becomes 55%.

The increases of 100 bp in the interest rates are very large for the 3-month 
period of the option. Still, the option value changes only by around ±3,000 DM.
This indicates the relative insensitivity of the FX option to the interest rates. 
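
As a sanity check on the table above, here is a minimal Python sketch that reprices the example from the footnoted parameters. It is not the pricer used for the table. It assumes constant volatility with no skew, continuously monitored barriers, continuously compounded rates with T = 92/365, and DEM as the domestic (discounting) currency. The double knock-out value uses a standard sine-series (eigenfunction) expansion of the absorbed lognormal density; the function names are illustrative only.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def forward_rate(spot, r_dom, r_for, T):
    # Interest-rate parity forward for the DM/USD rate
    return spot * math.exp((r_dom - r_for) * T)

def vanilla_call(spot, strike, vol, r_dom, r_for, T):
    # Garman-Kohlhagen call, in domestic currency per unit of foreign notional
    fwd = forward_rate(spot, r_dom, r_for, T)
    sd = vol * math.sqrt(T)
    d1 = math.log(fwd / strike) / sd + 0.5 * sd
    d2 = d1 - sd
    return math.exp(-r_dom * T) * (fwd * norm_cdf(d1) - strike * norm_cdf(d2))

def exp_sin_integral(c, beta, lo, hi):
    # Integral of exp(c*x) * sin(beta*x) over [lo, hi], in closed form
    def antiderivative(x):
        return (math.exp(c * x)
                * (c * math.sin(beta * x) - beta * math.cos(beta * x))
                / (c * c + beta * beta))
    return antiderivative(hi) - antiderivative(lo)

def double_ko(spot, strike, lower, upper, vol, r_dom, r_for, T, n_terms=50):
    # Returns (double knock-out call price per unit notional, knock-out probability).
    # Works in x = ln(rate / lower), absorbing barriers at x = 0 and x = width.
    width = math.log(upper / lower)
    x0 = math.log(spot / lower)
    k = max(math.log(strike / lower), 0.0)
    mu = r_dom - r_for - 0.5 * vol * vol      # drift of ln(rate)
    a = mu / (vol * vol)
    prefactor = math.exp(-a * x0 - 0.5 * mu * mu * T / (vol * vol)) * 2.0 / width
    price_sum = 0.0
    survive_sum = 0.0
    for n in range(1, n_terms + 1):
        beta = n * math.pi / width
        mode = math.sin(beta * x0) * math.exp(-0.5 * beta * beta * vol * vol * T)
        payoff = (lower * exp_sin_integral(a + 1.0, beta, k, width)
                  - strike * exp_sin_integral(a, beta, k, width))
        price_sum += mode * payoff
        survive_sum += mode * exp_sin_integral(a, beta, 0.0, width)
    price = math.exp(-r_dom * T) * prefactor * price_sum
    return price, 1.0 - prefactor * survive_sum

# Parameters quoted in the Parameters footnote for this example
T = 92 / 365
notional = 10_000_000
params = dict(spot=1.5303, strike=1.52, vol=0.08,
              r_dom=0.032366, r_for=0.054913, T=T)

fwd = forward_rate(1.5303, 0.032366, 0.054913, T)
standard_dm = notional * vanilla_call(**params)
dko_price, p_ko = double_ko(lower=1.45, upper=1.60, **params)
dko_dm = notional * dko_price

print(f"forward = {fwd:.5f} DM/USD")
print(f"standard call = {standard_dm:,.0f} DM")
print(f"double KO call = {dko_dm:,.0f} DM, P(knock out) = {100 * p_ko:.1f} %")
```

Under these assumptions the sketch should land close to the first four rows of the table; the last digits depend on day-count and settlement conventions, which is the point made in the Numerical Accuracy Reporting footnote.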


15 Parameters: This DEM/USD example was in 1996 before the Euro. Spot = 1.5303,
lower barrier = 1.45, upper barrier = 1.60. Strike = 1.52, vol = 8% (no skew), DEM rate = 
3.2366% ctn/365, USD rate = 5.4913% ctn/365, expiration = 92 days, notional = $10MM, 
no rebate. Dates: Settlement — Valuation, Payout — Expiration, no Tom-next. For the 
model I wrote, the symbols “DEM” and “USD” had to be exchanged along with the 
interest rates on the pricing screen. This sort of manipulation is standard. 


16 Numerical Accuracy Reporting: There is no way that the numerical accuracy of
option-model quotes should be believed to many decimals. You might like to construct
your own code and check the numerical accuracy of my pricer results above.


Rebates


A “rebate” may be part of the contract, providing for a fixed amount to the option 
holder in case a barrier is crossed. This rebate may either be paid at expiration or 
paid when the barrier is crossed (“at touch”). Rebates exist for single and double
barrier options. A rebate amounts to an extra knock-in digital option. 


The “Two-Country Paradox” 


The two-country paradox is generated by a logical argument¹⁷. We have two
equivalent notations for expressing foreign exchange, related by $\eta = 1/\xi$. It
cannot possibly make any difference if we announce (ignoring commissions etc.)
that 1 USD buys 10 MXN ($\eta = 10$ MXN/USD) or that 1 MXN buys 0.10 USD
($\xi = 0.10$ USD/MXN). For this reason, we should be able to make the change
of variable $\eta = 1/\xi$ in the mathematical expression for any financial FX
instrument without any physical change in the result. This, however, is not
generally the case. The drift after the change of variable is inconsistent with
interest-rate parity. Consistency with interest-rate parity for both variables
requires a separate normalization for the drift for each variable. These separate
normalizations are inconsistent with the direct change of variables $\eta = 1/\xi$. This
inconsistency is the paradox.
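Mathematically, we can express the paradox by noting that for any Gaussian
random variable $W$ we have

$$\langle \exp(W) \rangle = \exp\left\{ \langle W \rangle + \tfrac{1}{2}\left[ \langle W^2 \rangle - \langle W \rangle^2 \right] \right\}$$

We first set $W_\xi = \ln \xi$ and then set $W_\eta = \ln \eta = -W_\xi$, both assumed to be Gaussian
with constant volatility $\sigma$. We get at time $t$, after time interval $\tau$, the result

$$\langle \xi(t) \rangle \, \langle \eta(t) \rangle = \exp\left[ \sigma^2 \tau \right] \qquad (5.16)$$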


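This however is inconsistent with interest-rate parity (IRP), for which the
right-hand side would be 1, not $\exp[\sigma^2 \tau]$:

$$\langle \xi(t) \rangle_{IRP} = \xi_{fwd} = \xi_0 \exp\left[ (r_d - r_f)\,\tau \right]$$

$$\langle \eta(t) \rangle_{IRP} = \eta_{fwd} = \eta_0 \exp\left[ (r_f - r_d)\,\tau \right] = 1 / \langle \xi(t) \rangle_{IRP} \qquad (5.17)$$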


Since IRP is fundamental, it is necessary to define the drift for each variable
separately, consistent with IRP, and inconsistently with $\eta = 1/\xi$. It further adds
to the confusion that the change of variables $\eta = 1/\xi$ in the standard FX option
formula expressed in terms of $\xi$ does in fact lead to the standard option formula
expressed in terms of $\eta$. This is due to a fortuitous cancellation. The cancellation
does not occur for digital options or rebates, for example.

A common workaround procedure is to price instruments with $\xi$-specified
payoffs using $\xi$ dynamics and price instruments with $\eta$-specified payoffs using
$\eta$ dynamics, never allowing the substitution $\eta = 1/\xi$ ¹⁸.
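
The mismatch can be checked numerically. The sketch below is only an illustration with made-up inputs (the volatility and the two rates are assumptions, not values from the text): it gives $\ln \xi$ a Gaussian distribution with the IRP drift for $\xi$, integrates $\langle \xi \rangle$ and $\langle 1/\xi \rangle$ by the trapezoid rule, and compares their product with $\exp[\sigma^2 \tau]$.

```python
import math

def lognormal_means(xi0, sigma, tau, r_d, r_f, n=20001, width=10.0):
    # ln(xi) is Gaussian with the drift that makes <xi> equal the IRP forward
    mean = math.log(xi0) + (r_d - r_f - 0.5 * sigma ** 2) * tau
    sd = sigma * math.sqrt(tau)
    h = 2.0 * width * sd / (n - 1)
    mean_xi = 0.0
    mean_eta = 0.0
    for i in range(n):
        w = mean - width * sd + i * h
        density = math.exp(-0.5 * ((w - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = h * (0.5 if i in (0, n - 1) else 1.0)   # trapezoid rule
        mean_xi += weight * density * math.exp(w)        # <xi>
        mean_eta += weight * density * math.exp(-w)      # <1/xi>, i.e. eta = 1/xi
    return mean_xi, mean_eta

# Hypothetical inputs: xi0 = 0.10 USD/MXN as in the text; sigma and rates are assumed
xi0, sigma, tau, r_d, r_f = 0.10, 0.20, 1.0, 0.05, 0.08
mean_xi, mean_eta = lognormal_means(xi0, sigma, tau, r_d, r_f)

xi_fwd = xi0 * math.exp((r_d - r_f) * tau)           # IRP forward for xi
eta_fwd = (1.0 / xi0) * math.exp((r_f - r_d) * tau)  # IRP forward for eta

print(mean_xi / xi_fwd)      # close to 1: xi is consistent with IRP
print(mean_eta / eta_fwd)    # close to exp(sigma^2 tau): eta = 1/xi is not
print(mean_xi * mean_eta)    # close to exp(sigma^2 tau), as in Eq. (5.16)
```

The ratio for $\eta$ differs from 1 by the factor $\exp[\sigma^2 \tau]$, which is why the drift of $\eta$ has to be normalized separately rather than inherited from $\eta = 1/\xi$.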
It is not clear to me whether realizable arbitrage exists for this paradox.
However, consider one system $S_\xi$ that, for bookkeeping purposes, expresses a
given portfolio in terms of $\xi$. Consider another system $S_\eta$ that chooses to
express the same portfolio in terms of $\eta$. Say that the two systems used Monte
Carlo simulators, one in $\ln \xi$ and the other in $\ln \eta$, with the same random
number generator and the same starting seed, along with an extra minus sign for
changes in $\ln \eta$. For the system $S_\xi$, a point $\xi(t)$ on a given path is equivalent to
$\eta(t) = 1/\xi(t)$. However, this point $\eta(t)$ would not be attained on the
corresponding $\eta$ path for the system $S_\eta$. Again, this is because of the mismatch
in the drift for $\eta$ using the change of variable relative to interest-rate parity.


For this reason, exercise conditions, barrier conditions, rebates, etc. would not
match up between the two simulators, and the results would differ. Calculations
depending on confidence level would differ because the two simulations differ.
Therefore risk measures depending on a confidence level will differ, depending
on the convention of reporting using $\eta$ or $\xi$ ¹⁹.


To what extent do other markets have similar problems? Consider interest
rate dynamics where a physical rate $r(t)$ is derived from a change of variable
from another variable $y(t)$, e.g. lognormal, $y = \ln r$. The term structure
constraints from zero-coupon bonds determine the drift of the physical rate $r(t)$.
Using the change of variables, the drift of the variable $y(t)$ is determined. No
separate term-structure normalization exists for the drift of $y(t)$.


17 Martingale Disguise: This is the “martingale disguise”. Mathematicians proclaim that 
the requirement “should” be that the drift is determined for each variable using its own 
martingale condition. This just says that the drift of each variable should be determined 
by interest rate parity. The inconsistency with the change of variables $\eta = 1/\xi$ remains.


18 Use of one single FX Convention: A good example of what I am talking about is in
Riehl & Rodriguez, p. 43, problem 2.12, answer (Ref).


19 Who would notice? Naturally if the two systems using the two different conventions
to generate two different values for the same risk were in different firms, no one would
notice. However if there were different offices in different parts of the world reporting
risk using the different conventions (e.g. using $\xi$ and reporting risk in USD in NY and
using $\eta$ and reporting risk in EUR in Paris), there would be ambiguity and confusion.

However, interest rates have a different problem. As discussed in Ch. 43, even
if term structure constraints are satisfied, there is still arbitrariness in the drift of 
the stochastic process building up the probability density; this arbitrariness in 
pricing is compensated by a “numeraire” factor multiplying the probability 
density. While there is no ambiguity in pricing (requiring both numeraire and 
probability density), there is ambiguity in risk defined using an interest rate 
simulator with an arbitrary drift in the probability density. 


Quanto Options and Correlations 


Stock options with FX characteristics exist, called quanto options. The basic idea 
is that a stock is purchased in one country with its local currency (e.g. JPY) and 
the stock hedge in JPY is managed from a desk in another country with a 
different currency (e.g. USD). The USD investor X who buys a quanto pays an 
extra premium and is contractually insured against FX risk. The FX risk appears 
in the quanto option price. The critical issue here is the correlation
$\rho_{Stock,FX}$ between the movements of the stock price and the movements of the FX
rate²⁰. It can be proved rather generally that the quanto effect can be taken into
account by modifying the stock-price drift²¹. Equivalently, for standard quantos,
the stock dividend yield $q^{Stock}_{DivYield}$ can be changed to an “effective” stock dividend
yield $q^{Eff}_{DivYield}$, according to the prescription

$$q^{Eff}_{DivYield} = q^{Stock}_{DivYield} + \rho_{Stock,FX}\, \sigma_{Stock}\, \sigma_{FX} \qquad (5.18)$$


In Ch. 22-25 and Ch. 37, we discuss correlation risk in detail. A major 
problem with quanto options revolves around the instability of the correlations 
over the time scales of interest for hedging and reporting. 


20 Correlation Sign Convention (Watch Out): The sign of the correlation needs
attention. The example corresponds to the correlation between relative moves of $dS/S$ and
$d\eta/\eta$. The correlation between $dS/S$ and $d\xi/\xi$ has the opposite sign.


21 Quanto Options: The idea is to use a two-factor lognormal stock + FX model and
integrate out the FX degrees of freedom to obtain an effective one-factor stock model.
This can be done if the payoff only depends on the stock price locally (e.g. in JPY). The
n-factor model formalism is discussed in Ch. 45. We leave the application of this
formalism to this section as an exercise for the reader.


FX Options in the presence of Stochastic Interest Rates


We have exhibited the standard formalism of FX options in the presence of a
deterministic term structure of interest rates. This means that the domestic and
foreign interest rates $r_d, r_f$ are numbers, dependent on the time to expiration, but
are not random. The incorporation of stochastic interest rates implies a
three-factor model ($\xi$ or $\eta$, and $r_d, r_f$), as was recognized early on²². The numerical


analysis and technology are involved. Most of the risk of FX options is due to the 
FX rate. Some idea of the contribution of the interest rate fluctuations can be 
obtained from the rho and phi sensitivities, along with simple scenarios along 
outlier paths for the interest rates. 


Numerical Codes, Closed Form Sanity Checks, and Intuition 


Exotic FX options, options with FX components, bonds with FX-related payouts, 
etc. form a rich tapestry that goes beyond the scope of this book. Along with
American options and options including skew, no closed form solution may exist. 
Techniques of numerical codes for diffusion are then used. These include: 


• Binomial or trinomial discretization
• Monte Carlo simulation
• Finite difference or finite element analysis


Descriptions of these numerical methods are well documented, and we refer
the interested reader to the literature.

The choice of the type of numerical approximation depends on the analysis. 
Barriers are more difficult to represent in some schemes than others, for example. 
Monte Carlo simulators, while perhaps the most flexible, need careful attention to 
hedging parameter stability. 

When developing numerical code, it is always important to have the ability to 
look at approximations based on closed-form solutions or simple modifications of 
them. It is difficult to interpret numerical code output physically, and physical 
intuition is extremely valuable. This is because: 


• Intuition is behind the language traders, salespeople, and risk managers use,
and it's a good idea to be able to communicate physically why your code is
producing the results it is.

• Sanity checks provided by closed form solutions are useful in code
development. In particular, if your code output differs by say 30% from a
closed-form approximation, you are much more confident that there are no
bugs in your code than if the @#*$&*% black box code puts out a result
different from the approximate sanity check by say an order of magnitude.


22 History: For example, the three-factor FX option formalism is in my notes from 1989.
Actually most of the risk is from the FX rate, not the interest rates, so many people have
not gone to the trouble of including stochastic interest rates in their FX models.


6. Equity Volatility Skew (Tech. Index 6/10)


In this chapter, we consider volatility skew for equity options. We also include 
some formalism and simple skew models aimed at providing physical insight. 
Volatility skew refers to the strike dependence of the volatility. For example, 
some S&P 500 index option volatility data as a function of the strike $E$ using the
Black-Scholes model are shown below¹:


[Figure: Vol Skew Example. Averaged call and put volatility (“Call, Put Avg Vol”) plotted against the S&P option strike E.]


What this graph indicates is that if a single volatility is put into the usual 
stock-option Black-Scholes formula, then in order to reproduce the market option 
prices this volatility decreases as a function of the strike of the option. Physically, 
this condition seems to imply that “fear is greater than greed”. Thus, a premium


1 Data: Data are the averaged call and put midpoint DEC 1997 S&P index option vols on
5/14/97 when spot was 837.54. Nothing special happened on that day. We thank
Citigroup for the use of these data.

for OTM puts with low strikes to protect the downside exists [fear], relative to
OTM calls at high strikes to participate in the upside [greed]. Also, for that
reason, skew increases in importance after severe market perturbations.


Put-Call Parity: Theory and Violations 


Call and put option payoffs at option expiration satisfy the obvious relation²
$(S^* - E)_+ - (E - S^*)_+ = (S^* - E)$. The “Put-Call Parity” statement for
European options both with strike $E$ is a direct consequence. It says that the
call-put European option price difference today is the volatility-independent forward,

$$C_{call} - C_{put} = e^{-r \tau^*}\,(S_{fwd} - E) \qquad (6.1)$$

Here $e^{-r \tau^*}$ discounts from expiration back to today and $S_{fwd}$ is the forward stock price.
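
The parity statement can be checked in a few lines. The sketch below uses the textbook Black-Scholes formulas with a continuous dividend yield; the spot is the S&P level quoted in the Data footnote, while the strike, rate, dividend yield and expiry are assumed for illustration. Whatever single volatility is used for both options, the call minus the put equals the discounted forward minus strike.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes(spot, strike, vol, rate, div_yield, tau):
    # Returns (call, put, discounted forward minus strike)
    fwd = spot * math.exp((rate - div_yield) * tau)
    df = math.exp(-rate * tau)
    sd = vol * math.sqrt(tau)
    d1 = math.log(fwd / strike) / sd + 0.5 * sd
    d2 = d1 - sd
    call = df * (fwd * norm_cdf(d1) - strike * norm_cdf(d2))
    put = df * (strike * norm_cdf(-d2) - fwd * norm_cdf(-d1))
    return call, put, df * (fwd - strike)

# Spot from the Data footnote; the other inputs are assumptions
spot, strike, rate, div_yield, tau = 837.54, 800.0, 0.06, 0.017, 0.5

parity_gaps = []
for vol in (0.10, 0.275, 0.50):
    call, put, forward_value = black_scholes(spot, strike, vol, rate, div_yield, tau)
    parity_gaps.append(call - put - forward_value)
    print(f"vol={vol:.3f}  call-put={call - put:.6f}  forward value={forward_value:.6f}")
```

Each line prints the same call-put difference, so market call and put prices that satisfy parity must back out the same Black-Scholes volatility at a given strike.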


Therefore, call and put options theoretically have the same volatility. This is only
approximately true. There is a strike skew effect, shown below³:


[Figure: Call vol - Put vol. Vol difference (%) plotted against the strike E.]


2 Notation: $(S^* - E)_+ = S^* - E$ if $S^* > E$ and 0 otherwise.


3 What's the blip at E = 700? Put vols were anomalously low at this point, even below
put vols at 720. Should we take the market at face value and put in the spike, or smooth it
out? You be the judge. Just another example of “Dealing With the Data”.


Skew with Less Liquid Options


The complexities with skew are heightened for less liquid options. The simplest 
exotics are S+P barrier options. Even these relatively simple exotics are highly 
illiquid and little skew information is directly available. It is therefore common 
practice to model the skew, using a model for barrier options consistent with the 
skew information from the standard options market⁴. Options on individual
stocks usually do not have enough information even to construct a volatility 
surface (see below). Skew from the S+P index might be added on to the single- 
stock vol, with consequent vol basis risk between the S&P and the stock. Skew 
uncertainties are compounded for options on baskets of individual stocks. 

Therefore, skew is only well defined for non-exotic index options. However, 
the word “stock” will be used in this section instead of “index” for clarity. 


The Volatility Surface 


Before proceeding further with skew, we can consider different expiration times
$t^*$. If we put all volatility information on the same graph and fill in the gaps, we
would get a surface $\sigma(E, t^*)$ called the “volatility surface” or “vol surface”⁵ ⁶.


4 Philosophy of Skew: We may feel that we are discovering something fundamental
about the stock-price process from the S&P standard option market, in which case we
would naturally want to use this same process to price all S&P options, including barrier
options. A slightly less religious position, though equally pragmatic and leading to the
same activities, would simply state that it at least gives more confidence to have a model
that prices simple options correctly before tackling complicated options.


5 The Volatility Surface: The volatility surface is not easy to parameterize. First, the data
for different maturities do not cover the same ranges in strikes (short-dated options have a
more restricted range of strikes than long-dated options). Second, away from the money
the options are more illiquid, the bid-ask spreads increase, and the vols generally have
uncertainties. These uncertainties are partly due to the decreased option vega or price
sensitivity to the volatility away from the money. Third, the market for some parameters
may be one-sided (on this date there were no bids for DEC calls below 550, and no bids
for DEC puts below 475). Fourth, technical supply/demand effects appear that are not
constant with respect to $(E, t^*)$. The vol surface therefore sometimes has anomalies, and
these anomalies can cause difficulties with the option models.


6 Deep OTM Option Vols: A deep out of the money option has little volatility
dependence because it is very unlikely to be exercised, so volatility loses meaning. In
addition, there is an “administrative” charge for systems etc. that has nothing to do with
volatility. Finally, OTM options can be illiquid. This can produce vol surface difficulties.

There are two requirements for the volatility surface. The first is that variance
$\sigma^2(E, T)\,T$ should increase with time to expiry $T$ at fixed strike, so local
variances are positive⁷. The second is that the curvature of the option price with
respect to strike, which theoretically is proportional to the pdf, should be positive;
cf. Eq. (6.10).
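
Both requirements are easy to test on quoted data. The following sketch uses hypothetical helper names and made-up quotes: the first function computes the local (forward) vol between two expiries from the local-variance definition and rejects a pair of quotes whose total variance decreases, and the second checks that call prices are convex in strike.

```python
import math

def forward_vol(vol_a, t_a, vol_b, t_b):
    # Local variance between t_a and t_b at fixed strike:
    # sigma_ab^2 * (t_b - t_a) = sigma_b^2 * t_b - sigma_a^2 * t_a
    if t_b <= t_a:
        raise ValueError("need t_b > t_a")
    local_variance = (vol_b ** 2 * t_b - vol_a ** 2 * t_a) / (t_b - t_a)
    if local_variance < 0.0:
        raise ValueError("total variance decreases with expiry: negative local variance")
    return math.sqrt(local_variance)

def is_convex_in_strike(strikes, call_prices, tol=1e-12):
    # Positive curvature in strike means the slopes between neighbors never decrease
    slopes = [(call_prices[i + 1] - call_prices[i]) / (strikes[i + 1] - strikes[i])
              for i in range(len(strikes) - 1)]
    return all(slopes[i + 1] >= slopes[i] - tol for i in range(len(slopes) - 1))

# Made-up quotes for illustration
fv = forward_vol(0.30, 0.25, 0.28, 0.50)
print(round(fv, 4))

good = is_convex_in_strike([90, 100, 110, 120], [30.0, 21.0, 13.5, 8.0])
bad = is_convex_in_strike([90, 100, 110, 120], [30.0, 21.0, 15.0, 5.0])
print(good, bad)
```

A surface that fails either check implies a negative local variance or a negative probability density, which is the kind of anomaly that causes trouble in the option models.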


Dealing with Skew 


There are six methods to handle skew. 

Skew Method 1. Perturbative Skew: Use a standard model with a prescription 
for skew modification of the standard model. 

Skew Method 2. Static Replication: Use a combination of standard options 
that approximately replicates a boundary condition, e.g. at a barrier. 

Skew Method 3. Stochastic Volatility: Assume that the volatility is itself
stochastic with an assumed process including a “vol of vol” noise term.

Skew Method 4. Local Volatility Function: Local volatility here means to
make the volatility at time $t$ a function of the stock price, $\sigma_{Local}[S(t), t]$, such
that the S&P option prices are fit, replicating the skew⁸.

Skew Method 5. Intuitive Models: These are called “sticky strike” or “sticky
delta / sticky moneyness”; they are sometimes used to describe the stock-price 
dependence of the skew as time marches on. 

Skew Method 6. Jump Diffusion: These models parametrize jump processes 
that are included along with Brownian motion. The jumps modify the effective 
volatility. 


Perturbative Skew and Barrier Options 


A possible perturbative skew approach with an up-and-out (UO) call option is 
illustrated. The idea⁹ is to start with the standard barrier option model. Then a
skew correction is added perturbatively such that the boundary condition, 
terminal condition, and limiting properties are maintained. 


Some requirements for a reasonable skew approach are: 


• The option vanishes as the stock price approaches the boundary, $S_0 \to K$.

• The standard call, with the correct volatility, is recovered as the boundary is
removed ($K \to \infty$).

• The terminal payoff condition at expiration time $t^*_{mat}$ is not modified¹⁰.

A perturbative approach to include skew (the “T decomposition” formula),
similar in spirit but different in detail, was suggested by Taleb.


7 Local Variance: Consider expiration times $t_a$ and $t_b$ with $t_{ab} = t_b - t_a > 0$. Then the local
variance, a positive quantity, is defined by
$\sigma^2(E, t_{ab})\, t_{ab} = \sigma^2(E, t_b)\, t_b - \sigma^2(E, t_a)\, t_a$.


8 History: I wrote down Girsanov's Theorem with local vol using the path integral in the
late 1980's. See Chapter 42, Appendix 1.


9 History: I developed this particular perturbative skew approach in 1997.


Consider an up-out (UO) call. Assume strike $E$, barrier $K$, and time to expiration
or maturity $\tau^*_{mat}$. At expiration $t^*_{mat}$, the payoff (intrinsic value) looks like this:


[Figure: Bare replicating portfolio for Up & Out Call (intrinsic value shown for simplicity). The portfolio is: long call, strike = E; short call, strike = K; short digital, strike = K.]


The skew correction in this approach is generated by the skew of a
“replicating portfolio”, as shown in the figure. At expiration, this portfolio
replicates the barrier option. In a rough sense, the portfolio also approximates the
barrier option before expiration. The portfolio has a simple form (two calls and a
digital option). Only vols at the strike $E$ and barrier $K$ are needed.
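
A quick way to see that the bare portfolio works at expiration is to compare payoffs on a grid. In this sketch the size of the short digital, $K - E$, is an inference (it is the amount needed to cancel the call spread above the barrier), and the strike and barrier are taken from the numerical example's parameter footnote; the function names are illustrative.

```python
def call_payoff(s, strike):
    return max(s - strike, 0.0)

def digital_payoff(s, strike):
    # Pays 1 at or above the strike
    return 1.0 if s >= strike else 0.0

def bare_portfolio_payoff(s, E, K):
    # Long call struck at E, short call struck at K, short (K - E) digitals struck at K
    return call_payoff(s, E) - call_payoff(s, K) - (K - E) * digital_payoff(s, K)

def uo_call_terminal_payoff(s, E, K):
    # UO call at expiration, for a path that did not knock out earlier
    return call_payoff(s, E) if s < K else 0.0

E, K = 800.0, 1000.0
grid = [600.0 + 12.5 * i for i in range(65)]   # 600 to 1400 inclusive
max_gap = max(abs(bare_portfolio_payoff(s, E, K) - uo_call_terminal_payoff(s, E, K))
              for s in grid)
print(max_gap)
```

The payoffs agree everywhere at expiration. Before expiration the match is only rough, because the barrier option can knock out on paths that end below $K$.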


10 Notation: The subscript “mat”, short for option maturity, is there because other
intermediate times will show up below.

Again, the idea is that the skew of this portfolio is used in order to generate
the skew of the barrier option. The procedure is in App. A.
Other single barrier options can be treated similarly, or using sum rules¹¹.


Some Numerical Results for Barrier Options with Skew 


While naturally not identical, the results of this model are qualitatively similar to 
other approaches that have skew. For example, an up-out S&P call option¹² was
analyzed with the following results:


UO Call Price with No Skew (zeroth order approximation):

$C^{\text{Strike vol, No Skew}}_{\text{academic}} = 55.8$, $\quad C^{\text{Spot vol, No Skew}}_{\text{academic}} = 59.3$


UO Call Price with Different Skew Models¹³:

$C_{\text{Perturbative Skew}} = 61.2$, $\quad C^{\text{Local Volatility}}_{\text{MC Simulation}} = 61 \pm 0.5$, $\quad C_{\text{Static Replication}} = 60.8$


It is seen that the zeroth order approximation is better with the spot vol than the 
strike vol. The results including skew using three different model approaches are 
similar. We have included results from a MC local volatility simulation (see 
below), and from Derman's static replication approach, to which we now turn. 


11 Skew for other barrier options: Other single barrier options can be obtained using
sum rules. For example, the down and out (DO) call can be obtained from the DO put and 
the DO forward. Up and in (UI) options can be obtained from the standard options and 
the DO options. Double-barrier option skew can be obtained approximately by using a 
perturbative approach modifying the standard double-barrier closed-form solutions by 
skew corrections involving a limited number of images (e.g. one image for each barrier), 
along with sum rules. 


12 Barrier option example - parameters: Strike 800, barrier 1,000, spot 940, time to
expiration 0.3 yrs, rebate 64 paid at touch, strike vol 35%, spot vol 27.5%, barrier vol
24.5%, interest rate 6% (ctn365), dividend yield 1.7% (ctn365). Parameters rounded off.


13 Calculation Details for the Perturbative Skew Approach: The academic model
(including rebate) with spot vol was used for the 0th order approximation. Multiplicative
skew corrections with averaging over knockout times were used. The rebate was included
in the digital option for the bare replicating portfolio used to construct the skew. A
call-spread approximation was used for the digital. I thank Tom Gladd for assistance in the
calculations.


Static Replication


Static replication is a clever way of approximating a complicated path-dependent
option by a replicating portfolio consisting of simple options. This
set replicates the payoff and boundary conditions of the original option. Because 
payoff and boundary conditions uniquely determine any option once the 
stochastic equation for the underlying variable is given, the replicating portfolio 
can replace the original option. Derman noticed that this can be achieved as a 
function of time, so once the replicating portfolio is chosen, the same portfolio 
remains a replicating portfolio provided that the parameters in the stochastic 
equations do not change. 

The inclusion of skew using this approach is carried out by putting skew into 
each of the simple options in the replicating portfolio. Because the options are 
simple, the job of finding their skew is in principle relatively straightforward. 

In general, the replicating portfolio has an infinite number of options; in 
practice, this is approximated by a finite portfolio. Therefore, the achieved 
replication is only approximate. 

We next give a summary of Derman’s static replication method using path 
integrals and Green functions for a continuous-barrier up-out European call 
option¹⁴. Although the allowed region for paths contributing to this barrier option
is below the barrier, the idea is to replace the existence of the barrier by a 
replicating set of OTM call options that have no payoff for any path below the 
barrier. The OTM call option payoffs above the barrier are propagated back using 
the standard Green function without the barrier. That is, an equivalent problem 
with no barrier is used to solve the original problem with the barrier. This is 
similar to using images; here the images can be thought of as the set of OTM call 
options replicating the zero boundary condition along the barrier. 


The UO call¹⁵ $C^{(UO\ call)}(x, t)$ is equivalent to an ordinary call $C^{(E, t^*)}$, both
expiring at time $t^*$ with strike $E < K$, plus the replicating portfolio $V_{replicating}$ to
enforce the boundary condition at the barrier $x = \ln K$ at times $\{t_j\}$. An
OTM call $C_i = C^{(E_i, t_i)}$ in $V_{replicating}$ has fixed strike $E_i \ge K$, expires at $t_i \le t^*$
with payoff $C_i^* = C_i(x_i, t_i)$, and has weight $w_i$. The weights $\{w_i\}$ are chosen
to enforce the zero boundary condition¹⁶. At $t_j$, $V_{replicating}$ consists of those $\{C_i\}$
with $t_i > t_j$ that have not expired. Exhibiting the variable $x_j = \ln S_j < \ln K$,
below the barrier if not knocked out¹⁷, and noting $C_i(x_j, t_j) = C^{(E_i, t_i)}(x_j, t_j)$,

$$V_{replicating}(x_j, t_j) = \sum_{i:\ t_i > t_j} w_i\, C_i(x_j, t_j) \qquad (6.2)$$

We write the standard Green function for propagation to time $t_j$ back from time
$t_i > t_j$ in the absence of the barrier, including discounting, as $G(x_j, x_i; t_j, t_i)$.
Then the OTM call $C_i(x_j, t_j)$ is given by the discounted expectation integral

$$C_i(x_j, t_j) = \int dx_i\; G(x_j, x_i; t_j, t_i)\; C_i^*(x_i) \qquad (6.3)$$

Although the region of integration is over all values of $x_i$, the integrand is
nonzero only above the barrier, $x_i > \ln E_i \ge \ln K$. We now get the weights
$\{w_i\}$ backwards in time, setting $\left( C^{(E, t^*)} + V_{replicating} \right)(x_j = \ln K,\ t_j) = 0$ at each $t_j$.


14 Path Integrals and Barrier Options: See Ch. 17, 18 for barrier option formalism,
path integrals and Green functions in Ch. 41-45. The standard Green function for a
constant volatility has a Gaussian or normal form in the x variables. Skewed vol input is
used. You do not have to know anything more to follow the logic in this section.


15 Notation: The superscripts specify the option parameters (“which option”) and the
variables in unelevated parenthesis indicate where and when the option is evaluated.
This concludes the discussion of static replication. We now turn to the 
stochastic volatility method for including skew. 
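
The backward construction of the weights can be sketched in a few lines. This is a simplified illustration, not the calculation behind the numbers quoted earlier: it assumes a single constant volatility (no skew), no dividends, no rebate, equally spaced times $t_j$, and puts every replicating call at strike $E_i = K$. The strike, barrier, expiry, spot vol and rate are borrowed from the barrier example's parameter footnote; the function names are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, tau, vol, rate):
    # Standard Black-Scholes call; plays the role of the barrier-free Green function
    if tau <= 0.0:
        return max(spot - strike, 0.0)
    sd = vol * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * tau) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d2)

def replication_weights(E, K, T, vol, rate, n):
    # Replicating calls: strike K, expiries t_1 < ... < t_n = T
    times = [T * (j + 1) / n for j in range(n)]
    weights = [0.0] * n
    # Work backwards: at time T*j/n pick w_j so the portfolio vanishes at S = K
    for j in range(n - 1, -1, -1):
        t_prev = T * j / n
        value = bs_call(K, E, T - t_prev, vol, rate)
        for i in range(j + 1, n):
            value += weights[i] * bs_call(K, K, times[i] - t_prev, vol, rate)
        weights[j] = -value / bs_call(K, K, times[j] - t_prev, vol, rate)
    return times, weights

def portfolio_value(S, t, E, K, T, vol, rate, times, weights):
    # Ordinary call plus the unexpired replicating calls
    value = bs_call(S, E, T - t, vol, rate)
    for t_i, w_i in zip(times, weights):
        if t_i > t:
            value += w_i * bs_call(S, K, t_i - t, vol, rate)
    return value

E, K, T, vol, rate, n = 800.0, 1000.0, 0.3, 0.275, 0.06, 6
times, weights = replication_weights(E, K, T, vol, rate, n)
barrier_values = [portfolio_value(K, T * j / n, E, K, T, vol, rate, times, weights)
                  for j in range(n)]
print([round(w, 3) for w in weights])
print(max(abs(v) for v in barrier_values))
```

The portfolio is zero on the barrier at the chosen times only. Between those times it drifts away from zero, which is why a finite portfolio replicates the barrier option only approximately; to include skew, each call would be valued with its own strike-dependent vol.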


Stochastic Volatility 


Stochastic volatility is a natural idea. First, casual examination of a given 
historical time series over different time periods for a given window size usually 
produces noticeably different standard deviations, dependent on the window. 
These differences in standard deviations cannot be explained by the trivial 
observation that the window size is finite, and that some statistical finite-sample 
noise must be present in the standard deviation. 


16 Arbitrary replicating payoffs: Actually any payoffs will work that do not occur
below the boundary and that allow a solution for the weights. The strikes of the OTM
calls are theoretically arbitrary, so long as they are above the barrier.


17 Physical Region: We can ONLY look at the UO option with $x_j$ below or at the barrier,
never above the barrier. This is similar to the use of images in electrostatics.

Because the option volatility is (philosophically) supposed to be the market
perception of the future underlying standard deviation, it is logical to postulate 
that standard-deviation instabilities will show up as instabilities in the option 
volatility. These volatility instabilities can be modeled in a stochastic volatility 
framework using a “volatility of volatility” or “vol of vol”.

Instabilities in the option volatilities are also evident from the time 
dependence of the implied vol for a given option. The implied vol generally 
jumps around from one day to the next, as market supply/demand conditions 
change. If the distribution of implied vols is plotted and the standard deviation 
measured, we have an approximation to the vol of vol for that option. 

Stochastic volatility is often assumed to exist for risk management purposes. 
For scenario analysis, VAR¹⁸ and other risk measures, the vol of vol is important
because it represents an extra vega risk that can contribute significantly.

A simple and pedagogically accessible model of stochastic volatility is now
presented¹⁹. More sophisticated frameworks were developed by Hull and White,
and by Heston, where a stochastic process for the time dependence of the
volatility is postulated. The SABR model is popular. Carr gave a comprehensive
review. The model presented here is equivalent to assuming that movements in


the local volatility in all intervals $(t, t + dt)$ are equal, with this uniform


fluctuation chosen at random. That is, the local volatility fluctuations are parallel 
as a function of time. This model is transparent and sophisticated enough to 
exhibit the idea of stochastic volatility. 


We postulate a probability distribution function (pdf) $\wp[\sigma]$ for volatility $\sigma$.
For example, this volatility pdf could be a Gaussian distribution centered around
some “renormalized” volatility $\sigma_R$ with vol-of-vol width $\Lambda_\sigma$, between limits for
$\sigma \in (\sigma_R - \Lambda_L,\ \sigma_R + \Lambda_U)$, for some cutoff parameters $\Lambda_L$, $\Lambda_U$. We write

$$\wp[\sigma] = \mathcal{N} \exp\left[ -\frac{(\sigma - \sigma_R)^2}{2 \Lambda_\sigma^2} \right] \qquad (6.4)$$

Here the normalization $\mathcal{N} = \mathcal{N}(\Lambda_\sigma, \Lambda_L, \Lambda_U)$ is such that $\int \wp[\sigma]\, d\sigma = 1$.
Consider a European call option with strike $E$ with the volatility $\sigma$ dependence
made explicit (we include the discounting factor in the Green function):

$$C(x_0, t_0; \sigma; E) = \int_{\ln E}^{\infty} dx^*\; G_f(x^* - x_0,\ t^* - t_0;\ \sigma)\, \left[ \exp(x^*) - E \right] \qquad (6.5)$$


18 VAR: VAR is an acronym for Value at Risk, and is discussed in Chapters 26-30.


19 History: I constructed the stochastic volatility model in the text in 1986 after Andy
Davidson pointed out the stochastic nature of historical and implied volatilities for
T-bond futures. The original idea, also due to Davidson, was to find an average volatility
with more stable properties.


This just produces the usual Black-Scholes (BS) formula. If we now assert
that the volatility takes value $\sigma$ with probability $\wp[\sigma]\, d\sigma$ in $d\sigma$, we can define
the volatility-probability-averaged call option $C^{Avg\text{-}\sigma}$ as

$$C^{Avg\text{-}\sigma}(x_0, t_0; E) = \int_{\sigma_R - \Lambda_L}^{\sigma_R + \Lambda_U} d\sigma\; \wp[\sigma]\; C(x_0, t_0; \sigma; E) \qquad (6.6)$$

Clearly, this $C^{Avg\text{-}\sigma}$ has a different strike dependence than the volatility
un-averaged result. If we now try to interpret $C^{Avg\text{-}\sigma}$ as being given by the BS
formula, we will need to compensate for the volatility averaging by placing a
strike-dependent skew volatility $\sigma_{Skew}(E)$ into the Green function in the BS
formula such that we get the same numerical result for $C^{Avg\text{-}\sigma}$ as above, viz:

$$C^{Avg\text{-}\sigma}(x_0, t_0; E) = \int_{\ln E}^{\infty} dx^*\; G_f\left(x^* - x_0,\ t^* - t_0;\ \sigma_{Skew}(E)\right) \left[ \exp(x^*) - E \right] \qquad (6.7)$$

In this manner, stochastic volatility generates a skew dependence $\sigma_{Skew}(E)$.


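We could also define a volatility-averaged Green function $G^{Avg\,\sigma}$ as

$$G^{Avg\,\sigma}(x^* - x_0, t^* - t_0) = \int_{\sigma_R - \Lambda_1}^{\sigma_R + \Lambda_2} d\sigma\, \wp[\sigma] \cdot G_f(x^* - x_0, t^* - t_0; \sigma) \tag{6.8}$$

So also

$$C^{Avg\,\sigma}(x_0, t_0; E) = \int_{\ln E}^{\infty} dx^*\, G^{Avg\,\sigma}(x^* - x_0, t^* - t_0) \left[ \exp(x^*) - E \right] \tag{6.9}$$

Note that although the above two different integrals produce the same $C^{Avg\,\sigma}$,
the integrands are not pointwise equal. Some technical comments are in Appendix 2.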
This ends the discussion of stochastic volatility. We turn to local volatility. 
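The averaging in Eqs. (6.6) and (6.7) can be sketched numerically. The following is a minimal illustration, not the author's implementation: it assumes a truncated Gaussian volatility pdf on a uniform grid, the standard Black-Scholes call without dividends, and a simple bisection to back out the skew volatility. All function names and parameter values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(spot, strike, vol, rate, tau):
    """Black-Scholes call (no dividends), discounting included."""
    sd = vol * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * tau) / sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d1 - sd)


def vol_weights(sigma_r, width, cut_lo, cut_hi, n=401):
    """Grid and normalized trapezoid weights for a Gaussian vol pdf
    truncated to (sigma_r - cut_lo, sigma_r + cut_hi)."""
    lo, hi = sigma_r - cut_lo, sigma_r + cut_hi
    step = (hi - lo) / (n - 1)
    sig = [lo + i * step for i in range(n)]
    w = [math.exp(-0.5 * ((s - sigma_r) / width) ** 2) for s in sig]
    w[0] *= 0.5
    w[-1] *= 0.5
    total = sum(w)
    return sig, [x / total for x in w]


def averaged_call(spot, strike, rate, tau, sig, w):
    """Volatility-probability-averaged call: weighted sum of BS prices."""
    return sum(wi * bs_call(spot, strike, si, rate, tau) for si, wi in zip(sig, w))


def skew_vol(price, spot, strike, rate, tau, lo=1e-4, hi=3.0):
    """Single BS volatility reproducing `price` (bisection; BS is monotone in vol)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, mid, rate, tau) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


sig, w = vol_weights(sigma_r=0.20, width=0.05, cut_lo=0.15, cut_hi=0.15)
for strike in (80.0, 100.0, 125.0):
    c_avg = averaged_call(100.0, strike, 0.0, 1.0, sig, w)
    print(strike, round(skew_vol(c_avg, 100.0, strike, 0.0, 1.0), 5))
```

With a symmetric volatility pdf the strike dependence produced this way is smile-shaped (higher skew volatility away from the money), because the BS price is convex in volatility for strikes away from the money.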
Local Volatility and Skew


Local volatility (or local vol for short) is the volatility in the stochastic equation
for the stock-price relative changes $[S(t + dt) - S(t)]/S(t)$ in the time interval
$(t, t + dt)$. The technology of local volatility methods was developed by Derman
et al. and by Dupire. It seems that these methods are now preferred in the
industry for incorporating skew for equity options.
We can use path integrals to discuss local volatility. The path integral is
well formulated for a local volatility $\sigma_{S,t}$ determining the infinitesimal stock
price diffusion at time $t$, which can depend on the stock price,
$\sigma_{S,t} = \sigma[S(t), t]$. This is because each stochastic equation in $(t, t + dt)$ that
depends on the local volatility $\sigma_{S,t}$ is inserted as a constraint into the path
integral.

If each integral in the path integral at time $t_\ell$ is discretized into $N_\alpha$ points
and there are $N_\ell$ forward times in the time partition, then there are $N_\alpha N_\ell$ total
points in the discretization, and by construction, there are the same number
$N_\alpha N_\ell$ of local volatilities. Explicitly, at each expiration time $t_\ell$ ($\ell = 1 \ldots N_\ell$) we
specify $N_\alpha$ European option prices with different strikes $\{E_\alpha(t_\ell)\}$ ($\alpha = 1 \ldots N_\alpha$).
We can then numerically invert the equations determining the option prices
$\{C[E_\alpha(t_\ell)]\}$ for the local volatilities $\{\sigma(S_\alpha, t_\ell)\}$ in the diffusion process. It is
not necessary that the strikes $E_\alpha$ and the discretized lattice prices $S_\alpha$ be equal.

A direct way to organize the inversion uses the Fokker-Planck equation, as
shown by Dupire.
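As an illustration of such an inversion, the sketch below uses the commonly quoted form of Dupire's result for the no-dividend case, $\sigma_{loc}^2(E, T) = [\partial C/\partial T + r E\, \partial C/\partial E] / [\tfrac{1}{2} E^2\, \partial^2 C/\partial E^2]$, evaluated by finite differences. This formula is not reproduced from the text, and the function names, step sizes, and the flat-volatility test surface are assumptions made only for the example.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(spot, strike, vol, rate, tau):
    """Black-Scholes call (no dividends), discounting included."""
    sd = vol * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * tau) / sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d1 - sd)


def dupire_local_vol(price, strike, expiry, rate, d_strike=0.05, d_time=1e-4):
    """Local vol at (strike, expiry) from a call-price surface price(strike, expiry),
    using central finite differences in strike and expiry."""
    c_t = (price(strike, expiry + d_time) - price(strike, expiry - d_time)) / (2.0 * d_time)
    c_up = price(strike + d_strike, expiry)
    c_mid = price(strike, expiry)
    c_dn = price(strike - d_strike, expiry)
    c_e = (c_up - c_dn) / (2.0 * d_strike)
    c_ee = (c_up - 2.0 * c_mid + c_dn) / d_strike ** 2
    variance = (c_t + rate * strike * c_e) / (0.5 * strike ** 2 * c_ee)
    return math.sqrt(variance)


# Hypothetical check surface: flat 25% BS vol, for which the local vol is also 25%.
flat_surface = lambda strike, expiry: bs_call(100.0, strike, 0.25, 0.03, expiry)
print(dupire_local_vol(flat_surface, 110.0, 1.0, 0.03))
```

With market data in place of the flat test surface, the strike and expiry derivatives would have to come from interpolated option prices, which is where the numerical difficulties arise.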

A simple proof of Girsanov's theorem including local volatility was given
earlier by Dash in the late 1980s²⁰.


Analytic Results for local vol in Perturbation Theory 


Analytic results can be obtained for local volatility in perturbation theory. The
formalism is advanced and requires path integrals (cf. Ch. 42, Appendix D).


20 Girsanov and local vol: The simple proof of Girsanov's Theorem with local vol is in 
Ch. 42, App. A. 
The Skew-Implied Probability Distribution


The discounted expected integral for a European equity option with strike $E_\alpha$
and exercise time $t_\ell$ can be differentiated with respect to the strike of the option.
As discussed in Ch. 5, it is easy to see that the Green function is thus obtained
with final co-ordinate $x_\ell = \ln E_\alpha(t_\ell)$. Specifically (exhibiting only strike and
exercise labels), and including discounting, we get²¹,²²,²³

$$E_\alpha \frac{\partial^2 C[E_\alpha(t_\ell)]}{\partial E_\alpha^2} = G\left[x_\ell = \ln E_\alpha(t_\ell)\right] \tag{6.10}$$

While this relation has been known forever, the recent application is to turn
the statement around backwards to derive an empirical Green function $G_{empirical}$
or probability distribution function pdf from the strike curvature of the option
prices, as determined from market data. This $G_{empirical}$ is often plotted to exhibit
visually the deviation from the standard lognormal pdf. Naturally, numerical
issues arise in finding the option strike curvature since the options are only
known at discrete points, market anomalies (supply/demand) exist, etc.

Dupire and Derman showed that a diffusion equation is satisfied by the Green 
function written as a function of the strike. 
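The strike-curvature relation can be checked numerically in the simplest case. The sketch below assumes Black-Scholes call prices (constant volatility, no dividends) in place of market data, a Green function in the variable $x = \ln S$ with discounting included, and a central second difference in strike. The names and parameter values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(spot, strike, vol, rate, tau):
    """Black-Scholes call (no dividends), discounting included."""
    sd = vol * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * tau) / sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d1 - sd)


def lognormal_green_function(x_final, x0, vol, rate, tau):
    """Discounted Gaussian density in x = ln S for constant volatility."""
    var = vol * vol * tau
    drift = (rate - 0.5 * vol * vol) * tau
    gauss = math.exp(-((x_final - x0 - drift) ** 2) / (2.0 * var))
    return math.exp(-rate * tau) * gauss / math.sqrt(2.0 * math.pi * var)


def empirical_green_function(call_price, strike, d_strike=0.01):
    """Strike times the second strike-derivative of the call price,
    estimated with a central second difference."""
    curvature = (call_price(strike + d_strike) - 2.0 * call_price(strike)
                 + call_price(strike - d_strike)) / d_strike ** 2
    return strike * curvature


spot, vol, rate, tau = 100.0, 0.2, 0.02, 0.5
prices = lambda strike: bs_call(spot, strike, vol, rate, tau)
for strike in (90.0, 100.0, 110.0):
    print(strike,
          empirical_green_function(prices, strike),
          lognormal_green_function(math.log(strike), math.log(spot), vol, rate, tau))
```

With real quotes, `prices` would be an interpolation of discrete market prices, and the positivity of the curvature noted in the footnotes becomes a constraint on that interpolation.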


Local vs. Implied Volatility Skew; Derman’s Rules of Thumb 


An implied volatility, roughly speaking, is an average over local volatilities of the 
diffusion process all the way to expiration. This average is specified for an option 
of a given strike as the (single) number that has to be inserted in a Black-Scholes 
formula for the volatility to get the market price. The local volatilities depend on 


21 Local vol, Option curvature, and the Green function: This relation holds even if we
have local volatility. This is because the Green function is not touched in the derivation,
other than fixing the variable $x_\ell$ as indicated.


22 Curvature constraint: Note that because the Green function is a probability density
and so is positive, the curvature of the option price as a function of strike must also be 
positive. This puts a constraint on the numerical inversion procedure. Other constraints 
(e.g. a longer dated European option is more valuable than a shorter dated European 
option with the same parameters) also exist. See Ch. 5 for details. 


23 Swaptions: Swaptions are options on forward swaps. A similar formula to Eq. (6.10)
holds for swaptions, with the time to swaption exercise specified, with the forward swap 
time length fixed as a parameter. See chapters 8 and 11. 
the stock price and the time, and they specify the diffusion process over small
time steps. For this reason, the local volatility dependence on the underlying price 
and the implied volatility dependence on the strike are not equal. 

The picture below shows implied and local volatilities, illustrated with two
paths.


[Figure: Implied and local volatilities, illustrated with 2 paths. Labels: smaller
and bigger implied volatility; path 1, bigger local volatility; path 2, smaller
local volatility.]


Derman’s First Rule of Thumb 


Derman’s “1st Rule of Thumb” says that local volatility varies with the stock
price about twice as rapidly as the implied volatility varies with the strike. The
implied volatility is an average of the local volatilities along the paths
contributing to the option payoff. For a rough approximation, use ½ the sum of
the initial and final local volatilities. Then if the implied vol increases, the final
local volatility has to increase by twice as much to cancel the factor ½ from the averaging.
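A minimal worked version of this argument, assuming the rough approximation just described (implied volatility for strike $E$ taken as half the sum of the local volatility at today's stock price $S_0$ and at the final stock price $S = E$). The symbols $\sigma_{imp}$ and $\sigma_{loc}$ are shorthand introduced only for this sketch:

$$\sigma_{imp}(E) \approx \tfrac{1}{2}\left[ \sigma_{loc}(S_0) + \sigma_{loc}(E) \right] \quad\Rightarrow\quad \frac{\partial \sigma_{imp}}{\partial E} \approx \frac{1}{2} \left. \frac{\partial \sigma_{loc}}{\partial S} \right|_{S = E}$$

so the slope of the local volatility in the stock price is about twice the slope of the implied volatility in the strike.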

Consider two call options with the strikes indicated. Paths like path #1
contribute to $C_{call}(E_1)$ with bigger implied vol, but they do not contribute to
$C_{call}(E_2)$ with smaller implied volatility. Therefore, paths like path #1 must
have a bigger local volatility. Similarly, paths like path #2 must have a smaller
local volatility. So, as the stock price drops the local vol increases. Similarly, as
the stock price goes up the local vol decreases. We could have done the analysis
with put options. The bigger local vol for lower stock prices also reproduces the
put options at lower strikes made more expensive by the “fear” factor.

We need to consider what happens to the skew of an option as real time
moves forward²⁴. Now the dependence of the local volatility $\sigma[S(t), t]$ on the
stock price at forward times, viewed from today, can be determined today by


24 Crystal-Ball Analogy: This is similar to imagining the behavior of an option along
future worlds in future Wall Street Journals, viewed through a crystal ball today. 
the set of options today. Because the local volatilities determine the diffusion
process completely, the behavior of the implied volatility for a given option is 
then theoretically determined as real calendar time moves forward. Thus, if on 
Jan. 1 we determine the local vols from option data, then the values of implied 
vols on Feb. 1 are predicted once a given stock level on Feb. 1 is specified. 
Theoretically, at any point along a given stock price path, we can predict the 
value of any option and thus any implied volatility. 


Derman’s Second Rule of Thumb 


As time changes, the stock price will change. Derman’s “2nd Rule of Thumb”
says that the change in implied volatility of a given option for a change in stock 
price is approximately the change in implied volatility for a change in strike. This 
is reasonable since the price and the strike are in a sense dual variables. 


Derman’s Third Rule of Thumb 


The hedging of options is affected by skew. Because the skew slope is negative,
the local volatility decreases as the stock price moves up. Therefore, in that case,
a call option increases less than it would in the absence of skew. Thus, $\Delta_{call}$ is
reduced by skew. This reduction is the negative change in $C_{call}$ produced by the
negative change in volatility $\sigma_{call}$ for the given increase $\delta S$. This is Derman's
“3rd Rule of Thumb”; the reduction in $\Delta_{call}$ is

$$\delta\Delta_{call} = \frac{\partial C_{call}}{\partial \sigma_{call}} \frac{\partial \sigma_{call}}{\partial S} \approx \frac{\partial C_{call}}{\partial \sigma_{call}} \frac{\partial \sigma_{call}}{\partial E} = -\text{Vega} * |\text{Skew Slope}| \tag{6.11}$$
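The size of this delta reduction can be illustrated with Black-Scholes quantities. The sketch below is an assumption-laden example, not the book's procedure: it uses the standard BS delta and vega for a call without dividends, and takes the skew slope as a hypothetical number quoted in volatility units per unit of strike.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def skew_adjusted_call_delta(spot, strike, vol, rate, tau, skew_slope):
    """Return (BS delta, skew-adjusted delta, vega).
    skew_slope is d(implied vol)/d(strike); a negative slope reduces the delta."""
    sd = vol * math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * tau) / sd
    bs_delta = norm_cdf(d1)
    vega = spot * norm_pdf(d1) * math.sqrt(tau)  # per unit (not percent) of vol
    return bs_delta, bs_delta + vega * skew_slope, vega


# Hypothetical inputs: ATM 1-year call, skew slope of -0.2 vol points per $1 of strike.
bs_delta, adj_delta, vega = skew_adjusted_call_delta(100.0, 100.0, 0.20, 0.02, 1.0, -0.002)
print(bs_delta, adj_delta, vega)
```

The units matter here: vega per unit of volatility must be paired with a slope in volatility units per unit of price, otherwise the correction is off by factors of 100.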


Option Replication with Gadgets 


Derman and co-workers observed a clever identity for an option expiring
at some $t = t_\ell$ in terms of a weighted integral of options with different strikes
expiring at $t_\ell - dt_{\ell-1} = t_{\ell-1}$. Derman called a “gadget” the difference of the option
and the weighted integral and proposed that the identity be used as a method for
hedging an option with a set of other options. Because the hedging can be done
locally in time, the methodology hedges local volatility. Of course, the utility of
gadgets relies on the existence of gadgets in the market. We express an option
that expires at some time $t_\ell$ with strike $E_\ell$ as a weighted integral over options
that expire at earlier time $t_{\ell-1}$ with different strikes $\{E_{\ell-1,\alpha}\}$.

Here is the picture:


[Figure: Gadget Logic: Option at $t_\ell$ = $\sum_\alpha$ Options at $t_{\ell-1}$]


Here $\alpha$ looks like a discrete label (and for the trinomial model $\alpha$ = 1, 2, 3)
but will actually be continuous for our derivation, i.e. we will integrate over the
variable $E_{\ell-1}$. We use the subscript $\ell$ because Derman's idea is to use gadgets at
any time in the future.

Note that this is not "static replication". As we saw above, static replication is 
concerned with the determination of a set of options that can be used to replicate 
a boundary condition, e.g. a barrier for an up and out call. 

Derman used a trinomial model to illustrate the gadgets. We can derive the
gadget identity using path integrals and Green functions. We need to evaluate the
various options. Since we use lognormal dynamics, we set $y_{\ell-1} = \ln E_{\ell-1}$,
$x_{\ell-1} = \ln S_{\ell-1}$, etc. Denoting $C$ as a call option, the gadget definition is

$$\text{Gadget}(S_0, t_0) = C(E_\ell, t_\ell) - \int_{-\infty}^{\infty} dy_{\ell-1}\, G^{(E)}(y_\ell, y_{\ell-1}; dt_{\ell-1})\, C(E_{\ell-1}, t_{\ell-1}) \tag{6.12}$$
Here the weight function $G^{(E)}(y_\ell, y_{\ell-1}; dt_{\ell-1})$ is written in terms of the
strikes, and will be determined to make the gadget value equal to zero (so that the
gadget looks like a hedged portfolio). It will turn out that $G^{(E)}(y_\ell, y_{\ell-1}; dt_{\ell-1})$
is closely related to the usual Green function $G_{\ell-1,\ell}$, written in terms of the stock
variables for the last propagation between $t_{\ell-1}$ and $t_\ell$.

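The usual Green function $G_{0,\ell-1}$, propagating between the current time and
the time $t_{\ell-1}$, is common to all the options. We get

$$\text{Gadget}(S_0, t_0) = \int_{-\infty}^{\infty} dx_{\ell-1}\, G_{0,\ell-1} \left\{ I - II \right\} \tag{6.13}$$

Here the integrals $I$ and $II$ are produced by the two terms in Eqn. (6.12):

$$I = \int_{-\infty}^{\infty} dx_\ell\, G_{\ell-1,\ell} \left( e^{x_\ell} - e^{y_\ell} \right)_+ \tag{6.14}$$

$$II = \int_{-\infty}^{\infty} dy_{\ell-1}\, G^{(E)}(y_\ell, y_{\ell-1}; dt_{\ell-1}) \left( e^{x_{\ell-1}} - e^{y_{\ell-1}} \right)_+ \tag{6.15}$$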

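We want the equality of the integrals $I = II$, making the gadget = 0. Evidently,
we want a Gaussian in the strike variables for $G^{(E)}(y_\ell, y_{\ell-1}; dt_{\ell-1})$ to make it
look like $G_{\ell-1,\ell}$, up to some normalization factor. We also have a different
variable of integration. We make the change of the dummy variable
$y_{\ell-1} = -x_\ell + x_{\ell-1} + y_\ell$. After a little algebra, we obtain the identity $I = II$,
provided we set

$$G^{(E)}(y_\ell, y_{\ell-1}; dt_{\ell-1}) = \frac{E_\ell}{E_{\ell-1}} \cdot \frac{e^{-r_{\ell-1} dt_{\ell-1}}}{\left( 2\pi \sigma_{\ell-1}^2 dt_{\ell-1} \right)^{1/2}} \exp\left\{ -\frac{\left[ y_\ell - y_{\ell-1} - \left( r_{\ell-1} - \sigma_{\ell-1}^2/2 \right) dt_{\ell-1} \right]^2}{2 \sigma_{\ell-1}^2 dt_{\ell-1}} \right\} \tag{6.16}$$

Because $y_\ell - y_{\ell-1} = x_\ell - x_{\ell-1}$ from the change of variable, we find (with the
discounting factor $e^{-r_{\ell-1} dt_{\ell-1}}$ included in $G_{\ell-1,\ell}$)

$$G^{(E)}(y_\ell, y_{\ell-1}; dt_{\ell-1}) = \frac{E_\ell}{E_{\ell-1}}\, G_{\ell-1,\ell}(x_\ell, x_{\ell-1}; dt_{\ell-1}) \tag{6.17}$$

Noting that $dy_{\ell-1} = dE_{\ell-1}/E_{\ell-1}$ and $dx_\ell = dS_\ell/S_\ell$, we can cast this equation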
into a notation closer to Derman’s. This ends the discussion of local volatility. 
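The one-step identity between the two integrals can be checked numerically. The sketch below is an illustration under stated assumptions, not the book's derivation: it takes a single lognormal step with constant rate and volatility, uses the Black-Scholes closed form for the integral over the final stock price, and integrates the strike-weighted payoffs by the trapezoid rule with a weight equal to the strike ratio times the discounted one-step Gaussian. All names and parameter values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def integral_one(x_prev, y_final, vol, rate, dt):
    """Discounted expectation of (S_l - E_l)_+ over one lognormal step,
    starting at x_prev = ln S_{l-1}; this is the BS closed form."""
    sd = vol * math.sqrt(dt)
    d1 = (x_prev - y_final + (rate + 0.5 * vol * vol) * dt) / sd
    return (math.exp(x_prev) * norm_cdf(d1)
            - math.exp(y_final) * math.exp(-rate * dt) * norm_cdf(d1 - sd))


def strike_weight(y_final, y_prev, vol, rate, dt):
    """Weight in the strike variables: (E_l / E_{l-1}) times the discounted Gaussian."""
    u = y_final - y_prev
    var = vol * vol * dt
    gauss = math.exp(-((u - (rate - 0.5 * vol * vol) * dt) ** 2) / (2.0 * var))
    return math.exp(u) * math.exp(-rate * dt) * gauss / math.sqrt(2.0 * math.pi * var)


def integral_two(x_prev, y_final, vol, rate, dt, n=20001, n_sd=10.0):
    """Trapezoid integral over y_{l-1} of weight * (S_{l-1} - E_{l-1})_+."""
    sd = vol * math.sqrt(dt)
    lo, hi = y_final - n_sd * sd, y_final + n_sd * sd
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        y_prev = lo + i * step
        payoff = max(math.exp(x_prev) - math.exp(y_prev), 0.0)
        edge = 0.5 if i in (0, n - 1) else 1.0
        total += edge * strike_weight(y_final, y_prev, vol, rate, dt) * payoff
    return total * step


x_prev, y_final = math.log(100.0), math.log(105.0)
print(integral_one(x_prev, y_final, 0.3, 0.02, 0.25),
      integral_two(x_prev, y_final, 0.3, 0.02, 0.25))
```

The two printed numbers should agree to within the quadrature error, which is what makes the gadget value vanish for each value of the intermediate stock price.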


Intuitive Models and Different Volatility Regimes 


The heavy artillery of the local volatility described above contrasts with two
simple intuitive models for the movement of the implied volatilities with changes
in the stock price. These are called the “sticky-strike” model and the
“sticky-delta” or the closely related “sticky-moneyness” model.

Derman advocates the use of the different volatility models under different
regimes of realized stock volatility and he has analyzed the different time periods
defining these regimes from S&P volatility data.

Derman suggests that the local tree model is appropriate when the realized
volatility takes a sudden jump and the market moves downward rapidly.

Sticky Strike, Sticky Delta, and Sticky Moneyness Models 


The “sticky-strike” model assumes that the implied volatility for a given strike
remains unchanged regardless of the change in the stock price. For this reason,
$\Delta(E)$ for a given option of strike $E$ does not contain the extra term $\delta\Delta$
discussed above. Derman proposes this as a reasonable model in case the market
is trending sideways without change in realized volatility, so that keeping the
implied vols for existing options unchanged is a good match.

The “sticky-delta” model assumes that the ATM volatility remains
unchanged regardless of the change in the stock price. The “sticky-delta”
therefore refers to the ATM delta, i.e. to an unchanged $\Delta_{ATM}$ of the ATM option.
The delta $\Delta(E)$ for a given option of strike $E$ has an extra term $\delta\Delta$ with the
opposite sign from the $\delta\Delta$ of the local volatility model.

The “sticky-moneyness” rule is related to the sticky-delta rule. Moneyness is
defined as the ratio $E/S$. Thus an ATM option has moneyness = 1, so constant
ATM vol means constant moneyness vol, at least near moneyness = 1.

Derman suggests that the sticky-delta or sticky-moneyness rules are 
appropriate for hedging when the market is trending with a constant realized 
volatility. The constant realized volatility about the trending average corresponds 
to the choice of constant implied volatility at and near to the ATM point. 

This ends the discussion of the intuitive models and the use of the various 
volatility models in different regimes of stock price movement. 
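The two intuitive rules can be contrasted with a toy skew. The sketch below assumes a linear skew around a reference stock price, which is a parametrization chosen only for this example; the function names and numbers are hypothetical.

```python
def sticky_strike_vol(strike, spot_new, spot_ref, atm_vol_ref, slope):
    """Sticky strike: the implied vol of a given strike ignores the spot move
    (spot_new is accepted only to keep the two signatures parallel)."""
    return atm_vol_ref + slope * (strike - spot_ref)


def sticky_moneyness_vol(strike, spot_new, spot_ref, atm_vol_ref, slope):
    """Sticky moneyness: the implied vol depends on the strike only through E/S,
    so the ATM vol (moneyness = 1) is unchanged after the spot move."""
    return atm_vol_ref + slope * spot_ref * (strike / spot_new - 1.0)


spot_ref, atm_vol_ref, slope = 100.0, 0.20, -0.002  # negative skew slope
for spot_new in (100.0, 105.0):
    print(spot_new,
          sticky_strike_vol(100.0, spot_new, spot_ref, atm_vol_ref, slope),
          sticky_moneyness_vol(100.0, spot_new, spot_ref, atm_vol_ref, slope),
          sticky_moneyness_vol(spot_new, spot_new, spot_ref, atm_vol_ref, slope))
```

With a negative slope, the sticky-moneyness vol of a fixed strike rises when the stock price rises, so the corresponding extra delta term is positive, opposite in sign to the local volatility case described in the text.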
The Macro-Micro Model and Equity Volatility Regimes


It is interesting to note that the various regimes as characterized by Derman 
(trending, range-bound, jumpy) seem to correspond to the characteristics of the 
Macro-Micro model discussed in Ch. 47-51, as applied to equities. In the Macro- 
Micro model, we have quasi-random drifts or slopes corresponding to trending 
(with positive or negative drifts) and range-bound behavior (small drift), all with 
limited volatility. Jumps can exist, playing a role in the micro component. Skew 
in the Macro-Micro model can be accommodated using the stock price in the 
probability distribution functions for each of the components. 


Jump Diffusion Processes 


Andersen and Andreasen have examined the effects of jump diffusion processes
on generating volatility smile effects. We refer the reader to their paper and the
references therein.


Appendix 1: Algorithm for “Perturbative Skew” Approach 


The steps in the perturbative model described in the text follow. Although the
procedure might seem complicated, it is easy to implement numerically:

Step 1. Start: Use the standard “academic” UO barrier call formula²⁵
$C^{(E,K)}_{academic}(S_0, t^*_{mat}; \sigma_0)$ as the zeroth order approximation. The volatility $\sigma_0$ is
chosen phenomenologically to minimize the skew correction (the ATM $t^*_{mat}$ vol
works well). Here $S_0$ is the current (spot) price.

Step 2. Construct the “bare replicating” BR portfolio²⁶ for the payoff of the
UO call at some intermediate time $t^*$, $C^{(E,K)}_{BareRep}[S_0; \sigma_E(t^*), \sigma_K(t^*)]$. This
BR portfolio will not be used to approximate the barrier option but its skew
dependence will be used to construct the skew correction perturbatively by an
add-on to the standard UO barrier formula in Step 1. $C^{(E,K)}_{BareRep}$ with skew has three


25 Barrier Option Academic Formalism: We will discuss this in Chapter 17.


26 Derman's replicating portfolio: The “replicating portfolio” here is not the same as
Derman's replicating portfolio. Derman’s replicating portfolio is exact but the 
composition of his replicating portfolio is complicated to construct because it relies on a 
back-chaining algorithm. 
pieces: long call [strike $E$, vol $\sigma_E(t^*)$]; short call [strike $K$, vol $\sigma_K(t^*)$];
short digital option²⁷ [strike $K$, height $K - E$]. We first choose $t^*$ as the average
knockout time $t^*_{AveKO}$; this maintains the boundary condition as $S_0 \to K$, the
terminal condition as $t_0 \to t^*_{mat}$, and the limit $K \to \infty$ (boundary is removed).

Step 3. Construct the no-skew BR portfolio, i.e. without skew for the vols,
$C^{(E,K)}_{BareRep}[S_0; \sigma_E(t^*), \sigma_K(t^*)]$ where $\sigma_E(t^*) = \sigma_K(t^*) = \sigma_0$.

Step 4. Subtract the skewed and unskewed BR portfolios in (2) and (3) above
to get the first approximation for the skew correction, $\delta C^{(E,K)}(t^*)$.

Step 5. Probability-weight $\delta C^{(E,K)}(t^*)$ by $\wp_{KO}(S_0; t^*)$, the probability²⁸
that knock out occurs at $t^*$, and then integrate over $t^*$. In order to maintain the
knockout boundary condition, a subtraction²⁹ $\delta C^{(E,K)}_{Sub}$ first has to be made to
$\delta C^{(E,K)}$, “renormalizing” it to $\delta C^{(E,K)}_{Ren} = \delta C^{(E,K)} - \delta C^{(E,K)}_{Sub}$. The skew
correction including knock-out probability weighting is then of the form

$$\delta C^{(E,K)}_{FinalSkewCorrection} = \int_0^{t^*_{mat}} \wp_{KO}(S_0; t^*)\, \delta C^{(E,K)}_{Ren}(t^*)\, dt^* \tag{6.18}$$

The result for the barrier option including skew is

$$C^{(E,K)}_{UO\ Call} = C^{(E,K)}_{academic} + \delta C^{(E,K)}_{FinalSkewCorrection} \tag{6.19}$$

27 Common Digital Option Approximation: The digital option can be approximated by
a call spread (the difference of call options) having strikes $K$, $K + \delta K$ with $\delta K$ a small
amount. Strictly speaking, the skew for these two options should also be included. This
approximation is the way digital options are actually hedged.


28 Smearing out the Skew Correction with the Knockout Probability: We use the
academic formula for the knockout probability. We need to renormalize this probability 
to one. This is because we are merely smearing out the skew correction derived with the 
average knockout time. Note that then the replicating portfolio theoretically becomes an 
infinite number of options. 


29 The Subtraction: The subtraction $\delta C^{(E,K)}_{Sub}$ has the same form as $\delta C^{(E,K)}$ with the
current stock price replaced by $K$ and the nominal volatility $\sigma_0(t^*)$ replaced by $\sigma_K(t^*)$.
An alternative multiplicative prescription for skew exists³⁰. The idea is to
scale the skew using a reasonable base, e.g. the BR portfolio as defined in Step 3,
$C^{(E,K)}_{BareRep}[\text{Step}(3)]$. Define a “skew factor” $\mathcal{F}_{SkewFactor}$ including knockout
probability weighting as

$$\mathcal{F}_{SkewFactor} = \int_0^{t^*_{mat}} \wp_{KO}(S_0; t^*)\, \frac{\delta C^{(E,K)}_{Ren}(t^*)}{C^{(E,K)}_{BareRep}[\text{Step}(3)]}\, dt^* \tag{6.20}$$

This produces the multiplicative prescription for the skew perturbation

$$C^{(E,K)}_{UO\ Call} = C^{(E,K)}_{academic} \left[ 1 + \mathcal{F}_{SkewFactor} \right] \tag{6.21}$$

Appendix 2: A Technical Issue for Stochastic Volatility 


This appendix has some comments on large volatility contributions for the 
stochastic volatility model in the text, which we believe present a technical issue 
for stochastic volatility models in general. 


If we let the limits of integration $(\sigma_R - \Lambda_1, \sigma_R + \Lambda_2)$ for the volatility
become $\sigma \in (0, \infty)$ we can get a closed form solution for $C^{Avg\,\sigma}$ in terms of an
infinite series of modified Bessel functions of the third kind by expanding the
exponential containing terms linear in $\sigma$ and using the identity (for any $\nu$ with
Re(a), Re(b) positive)

$$\int_0^\infty \sigma^{2\nu - 1} \exp\left( -\frac{a}{\sigma^2} - b\sigma^2 \right) d\sigma = \left( \frac{a}{b} \right)^{\nu/2} K_\nu\left( 2\sqrt{ab} \right) \tag{6.22}$$

However, interchange of the multiple $\int d\sigma \int dx^*$ integrals is problematic if
the volatility becomes infinite (even if large volatilities are suppressed), because
the Gaussian spatial damping for large $x^*$ is not uniform. It makes a difference
whether we let $x^* \to \infty$ for fixed $\sigma$ or we let $\sigma \to \infty$ at fixed $x^*$, or we take a
scaled limit as both variables become infinite. The non-uniformity in the two
variables $x^*$ and $\sigma$ just described would seem to pose a problem for stochastic
volatility models that do not have some lattice cutoff for the volatility.


30 Acknowledgement: I thank Cal Johnson for suggesting the multiplicative idea, for
discussions on Derman’s work, and for many other conversations. 
7. Forward Curves (Tech. Index 4/10)


In this chapter, we discuss the construction of the forward-rate curves needed 
especially for pricing interest-rate derivatives. We begin with a discussion of the 
input rates to the forward curve construction models, and then discuss the 
mechanics. 


Market Input Rates 


Fixed-income securities pricing and hedging rely on forward-rate curves. We 
now consider the input market rates for the construction of the forward-rate curve 
in a generic derivative system. The input data needed to construct the curve are 
described below. The input data are typically obtained either from vendor sources 
or from the desk in the case of a broker-dealer (BD)¹.

The choice of which input rates are used depends to some extent on the 
algorithm chosen to construct the yield curve. The yield curve is used to construct 
the set of discount factors employed for discounting future cash flows². The input
rates used also depend on the currency, since different instruments are available
in different currencies³. The rates include various types. For the US market, these
are cash rates, ED futures, Libor swaps, and US Treasuries. 


1 The Close of Business: Rates of course vary during the course of the trading day.
Official pricing to be put into the books and records will generally use rates from the 
close of business (COB). It is best if these COB rates are saved automatically to a 
database so that later analysis is facilitated. 


2 Future and Forward Jargon: Do not mix up the word “future” in a phrase like “future
cash flow” (cash to be paid some time hence) with “a future” (a contract to buy/sell
something at a fixed price at some date hence). Also, do not mix up the word “forward”
in a phrase like “forward time period” with “a forward”, which is “a future” modified by
a small “convexity” correction, as we shall discuss later.


3 Yield Curves for Different Currencies: The main currencies for the largest swap
markets and their notations are USD ($US), JPY (Japanese Yen), EUR (Euro), and GBP
(Great Britain Pound) or STG (Sterling). Financial futures and swaps are available at
different maturities for these currencies. The use of liquid instruments is in principle
important for reliable pricing. For other currencies including emerging markets there is
less choice, and there may only be a few instruments available to construct the yield
curve, liquid or not.
More recently, due to Libor rate increases in the 2008 crash, “OIS” rates have
replaced Libor as the closest market approximation to a “risk free” discounting
rate⁴.


Cash Rates 


Cash rates are short-term “money-market” rates, usually involving the cash
Libor interest rates⁵ paid for deposit times of various lengths out to 1 year for
deposits in London, e.g. US Libor (USD “Eurodollar” deposits), JPY Libor (JPY
“Euroyen” deposits) etc.


Futures 


For USD, the Eurodollar (ED) futures, based on 3-month ED deposits, are used.
ED futures are cash-settled (i.e. no delivery of a security as for treasury futures).
The price of a future $P_f$ is between 0 and 100, and corresponds to a future rate
$r_f = 100 - P_f$. The ED futures have a notional of $1MM and correspond to a
3M forward time period⁶, so a 1bp change in rates⁷ corresponds to a price
change⁸ \$d$P_f$ = \$25. This is the change in the interest for a $1MM deposit over
¼ year⁹, if rates were to change by 0.0001/yr. Using Excel spreadsheet notation¹⁰


4 OIS and Libor: The Overnight Index Swap rate is an inter-bank fixed-float swap rate
set by a central bank with no exchange of principal and little counterparty risk; the 
floating leg of an OIS is the Fed Funds rate in the U.S. and Eonia for euros etc. Libor is a 
rate for inter-bank loans for which actual cash changes hands, and so has counterparty 
risk. The Libor-OIS spread is generally small in non-stressed markets, but hit high values 
during the 2008 crisis when banks had high risk. Discounting in models is now done with 
OIS rates to approximate a “risk-free rate". 


5 Libor, Hibor etc.: The acronym Libor stands for “London Interbank Offered Rate”. Libor
rates in different currencies have different names (e.g. HIBOR for Hong Kong dollar).


6 Quantity and Time Notation: Abbreviations for quantity [MM = million; B = billion;
M, K = thousand]. Abbreviations for time [Y or yr = year; M = month; D = day].


7 Basis Points, Units and Some Advice: Small changes in rates are measured in “basis
points” or bp. Numerically one bp = 0.0001. Actually there is an implicit time unit, since
interest rates are usually quoted in amounts paid per year (e.g. r = 4%/yr) so the relevant 
“one basis point" change in an annual rate r would be dr = 1bp/yr = 0.0001/yr. Usually 
the symbol bp is used without the units. While it sounds trivial, dropping the units easily 
leads to confusion. The potential quant is highly advised to put all units in all equations, 
at least for himself/herself. 


8 Sensitivities of Futures in other Currencies: The price changes for futures for 1 bp/yr
change in rates for other currencies have other conventions, but they are all determined 
by the same sort of equation as presented in the text. 
we have \$d$P_f$ = \$10^6 * 0.25 yr * 0.0001/yr = \$25 per bp change in rate. The
value of the future increases if rates decrease. A given ED future will have its
value determined at a definite IMM fixing or reset date¹¹.
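The arithmetic above can be written as a tiny helper. This is only a sketch of the quoted convention ($1MM notional, ¼ year, 1 bp = 0.0001/yr); the function names are hypothetical.

```python
def ed_future_rate_percent(price):
    """Future rate in percent per year implied by an ED futures price between 0 and 100."""
    return 100.0 - price


def ed_price_change_per_bp(notional=1_000_000.0, year_fraction=0.25, one_bp=0.0001):
    """Dollar change in interest on the notional deposit for a 1 bp/yr rate change."""
    return notional * year_fraction * one_bp


print(ed_future_rate_percent(98.5))   # rate implied by a price of 98.5
print(ed_price_change_per_bp())       # dollar value of one basis point
```

Keeping the year fraction and the basis-point size as explicit arguments follows the advice in the footnotes to carry the units through every equation.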


Libor Swaps 


A Libor swap is an exchange of fixed payments determined by a “fixed rate” for
floating payments based on US Libor values at various times in the future. A
“swap rate” is that fixed rate which makes the swap value zero, or indifferent to
the choice of fixed or floating payments. USD Libor swap rates are available with
high precision. They serve as constraints, since swaps priced by the swap model
using the model yield curve have to agree with the input swap rates. Historically
the swap rates are expressed as spreads to treasuries¹². We look at swap pricing in
Ch. 8. The USD swap rates based on Libor are now the most liquid instruments 
available in general, even more than treasuries, and the swap rates are now 
tending to become considered as the fundamental rates to which other rates are 
compared. Swap rates in other currencies (e.g. Euro, GBP, JPY) provide similar 
constraints for input to pricing deals in those currencies. 


Treasuries 


For the USD curve, these are US-government notes and bonds. For other 
currencies, the available government securities are used. US treasury securities 


9 Money Market Day-Count Conventions: The meaning of “one year” changes from
currency to currency for these futures. For USD, there are considered to be 360 days/yr 
but for GBP there are 365 days/yr. These “money-market day count" conventions are not 
the same as the conventions for bonds. And you thought that it was stupid to have 
different units for the measurement of electricity! 


10 Arithmetic Notation: Finance people often denote multiplication by a star * and
powers by a caret ^ used in Excel spreadsheets, so 4*4*4 = 4^3 = 64.


11 IMM Fixing/Reset dates and Settlement dates for Futures: The (IMM =
International Monetary Market) “fixing” or “reset” dates, at which the values of the ED
futures are determined, are the 3rd Mondays of March, June, September, and December in
future years (abbreviated e.g. as MAR04 for March, 2004). Payment (settlement) is made 
2 days later, a tradition started by snail mail. This is a minor but annoying complication, 
since the risk of an instrument depending on a Libor future is zero after that rate is set, so 
rate uncertainty or diffusion only goes up to Monday, but an extra 2 days of discounting 
for the cash flow should be included up to Wednesday. Conventions are different for 
different currencies. Finally, there are ED futures other than 3M (e.g. 1M). 


12 Bid, Offer, Mid Swap Spreads: There is a bid and offer side to the swap spreads. The
bid might be 40 bp/yr and the offer 44 bp/yr. This means that a potential fixed rate payer 
is ready to pay 40 bp/yr and a potential fixed rate receiver wants to receive 44 bp/yr 
(above the corresponding treasury rate). The “mid” swap spread is the average (42 bp/yr). 
are auctioned from time to time. The last-auctioned (“on-the-run”) securities are
those that are the most liquid. For the Libor curve, treasuries are really only used
for representing the swap rates as swap spreads to treasuries.

Treasuries are used directly to construct the Constant Maturity Treasury
(CMT) curve that is used to price CMT deals. The CMT curve takes account of
the various yields and results from a model-dependent interpolation¹³ to get a
nominal value for a treasury yield at any maturity. Note that of course securities
do not have their original time to maturity since they were sold so the “10 year
treasury” will be plotted at somewhat less than 10 years¹⁴.


Construction of the Forward-Rate Curve 


We now discuss the construction of the US Libor forward-rate curve given the 
input rates discussed above. Actually, the interest-rate curve can be expressed in 
a number of different conventions. 


The forward rate $f^{(T)}(t_0; \Delta T)$ is the interest rate we obtain in a contract
made today $t_0$ for a hypothetical deposit starting at time $T$ and ending at time
$T + \Delta T$. Although we could define the forward rate for any $T$, in practice the
forward rates are only modeled at a given discrete set of starting times $\{T_k\}$, and
with a fixed $\Delta T$. The set of rates thus obtained is the forward-rate curve used in
pricing.

A summary of the kinematics is pictured next:


13 Off the Run Govies and CMT: “Off-the-run” securities are securities from previous 
auctions. They may be included in the algorithm giving the CMT rates through 
interpolations. However, off-the-run securities are less liquid because many bonds have 
already been placed in portfolios where they are held to term. Therefore, these get less 
weight in the fit. By the way, government securities are known as “Govies”. 


14 Sliding Down the Yield Curve: Because securities that have already been issued do
not have their original time to maturity, the jargon is that the securities have “slid down
the yield curve” from the point where they were issued. The picture is that the yield curve
is positively sloped (yields for bonds of longer maturity are greater than bonds of shorter
maturity), which is usually, but by no means always, the case.
[Figure: Forward Rate Kinematics, showing $f^{(T)}(t_0; \Delta T)$]


The notation is as follows:
• Today's forward rate curve is the set of $f^{(T)}(t_0; \Delta T)$ for different $T$,
• $t_0$ = “Today”, the date of the input data and also the contract date for the
hypothetical $(T, T + \Delta T)$ deposit,
• $T$ = Start of the deposit time period,
• $T + \Delta T$ = End of the deposit time period.


General Forward Rates 

A generalization^15 of the forward rate specifies a future time $t$ with $t_0 < t < T$, at
which time we determine what interest we will get paid for a deposit starting at
$T$ and ending at $T + \Delta T$, still using the input data from $t_0$ today^16. This
interest rate we can call $f^{(T)}(t;\Delta T)$. We need this generalization if we
simulate the forward rate moving in time $t$, always corresponding to a deposit
from $T$ to $T + \Delta T$, starting from today’s forward rates determined by today's
data^17. The set of such forward rates at time $t$ for different $T$ forms the
forward-rate curve at time $t$. The kinematics are illustrated below:


15 Forward-Rate Example: For example, take a 3-month deposit of money from
$T$ = 3/1/04 to $T + \Delta T$ = 6/1/04. We may want to know what rate we can expect to
contract as of $t$ = 9/1/03 for this deposit, using input data from today $t_0$ = 6/1/03. We will
wind up with a set of the possible values of this particular rate, for example using a
Monte Carlo simulator of the diffusion process for the forward rate.


16 $T$ can be a calendar date or can signify an interval $\tau$ after $t$.


17 HJM: The most complete formulation of this idea is due to Heath, Jarrow, and Morton
(HJM). See the references.

[Figure: General Forward Rate Kinematics]


The Forward Rates and ED Futures 


We need today’s forward rate curve $\{f^{(T)}(t_0;\Delta T)\}$ for different maturities
$\{T\}$. In practice^18, we can choose the type of forward rate $\Delta T = 3M$. This will
allow us to use data from today’s ED futures, which are 3-month rates, for
constraints. Say $T$ is one of the IMM fixing dates $t_{IMM}$. Then the forward and
ED future deposit start-times coincide, so the rates must (one would think)
coincide. We get the reasonable-looking constraint:

$$ f^{(T = t_{IMM})}(t_0; 3M) = f^{(t_{IMM})}_{ED\ Future}(t_0) \qquad (7.1) $$


The Forward vs. Future Correction 


The above equation is not quite correct. There is a difference between the ED
future rate and the forward rate even when the deposit start times
$T = t_{IMM}$ coincide^19. This correction can be modeled, and it typically turns out to


18 Meaning of 3M: 3M means 3 months. With the US money-market convention that one
year has 360 days, 3M means 90 days.


19 Futures and Margin Account Fluctuations: Consider a buyer X of ED futures. He
must have a “margin account” that must maintain sufficient funds covering changes in the
value of the futures. Every day, this account is “marked to market”, meaning that funds
can be withdrawn by X from the account if the future price goes up (rates down), and
funds must be added by X to the account if the future price goes down (rates up). This
changes the economics of buying futures. The fluctuations in the margin account thus
clearly increase with increased rate fluctuations (called rate volatility). The buyer of a
forward rate contract does not have a margin account.

behave roughly as the square of the time difference $(T - t_0)$. The correction is
added to the future’s price or subtracted from the future rate to get the forward
rate. For example, the mean-reverting Gaussian rate model produces^20

$$ f^{(T)}_{Future} - f^{(T)}_{Forward} = \frac{\sigma^2}{2}\left[\frac{1 - e^{-\omega (T - t_0)}}{\omega}\right]^2 \qquad (7.2) $$


The correction can also be backed out numerically from implementing overall
consistency of the forward rate curve in the futures region with other data^21. The
results are roughly consistent with the model for appropriate choices of $\sigma$, $\omega$.
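
As a rough numerical illustration, the following Python sketch evaluates the mean-reverting Gaussian correction, assuming it has the form $(\sigma^2/2)\,[(1 - e^{-\omega (T - t_0)})/\omega]^2$. The function name and the parameter values ($\sigma$ = 1% per year, $\omega$ = 0.05 per year) are assumptions chosen only for illustration; they are not calibrated to any market.

```python
import math


def futures_forward_correction(sigma, omega, tau):
    """Futures rate minus forward rate in a mean-reverting Gaussian rate model.

    sigma: Gaussian rate volatility (rate units per sqrt(year))  [illustrative]
    omega: mean-reversion strength (1/year)                      [illustrative]
    tau:   T - t0, the time to the deposit start, in years
    """
    if abs(omega) < 1e-12:
        b = tau  # limit of (1 - exp(-omega * tau)) / omega as omega -> 0
    else:
        b = (1.0 - math.exp(-omega * tau)) / omega
    return 0.5 * sigma ** 2 * b ** 2


for tau in (0.25, 1.0, 2.0, 5.0):
    corr_bp = 1e4 * futures_forward_correction(sigma=0.01, omega=0.05, tau=tau)
    print(f"T - t0 = {tau:4.2f} yr: correction = {corr_bp:5.2f} bp")
```

With these assumed parameters the correction is well under 1 bp for the front contract and close to 10 bp at 5 years, and for small $\omega$ it grows quadratically in $T - t_0$.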


Use of Swaps in Generating the Curve 


In general, a swap-pricing model is used along with the determination of the
forward rates (we discuss the specifics of swap pricing below). Each fixed rate
$E_\alpha^{(model)}$ for the corresponding model par swap with zero value $\$S_\alpha = 0$ (obtained
from a putative set of forward rates) is compared with data $E_\alpha^{(data)}$. The forward
rates are then chosen such that each $E_\alpha^{(model)} = E_\alpha^{(data)}$, up to some small
numerical error.
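
The fitting step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the text: equal quarterly periods, a single curve used both for the forwards and for discounting, forwards taken flat between successive swap maturities, and made-up par-rate data. The names `par_rate` and `fit_forwards` are hypothetical.

```python
def par_rate(forwards, dt=0.25):
    """Model par (zero-value) swap rate implied by a list of period forwards."""
    df, annuity, float_pv = 1.0, 0.0, 0.0
    for f in forwards:
        df /= 1.0 + f * dt          # discount factor to the end of the period
        annuity += dt * df          # value of paying 1 per year over the period
        float_pv += f * dt * df     # value of the floating payment
    return float_pv / annuity


def fit_forwards(par_data, dt=0.25):
    """Choose piecewise-flat forwards so that model par rates match the data.

    par_data: list of (number_of_periods, par_rate), sorted by maturity.
    """
    forwards = []
    for n_periods, target in par_data:
        n_new = n_periods - len(forwards)
        lo, hi = -0.5, 1.0          # bracket for the new flat forward
        for _ in range(100):        # bisection; par_rate increases with the forward
            mid = 0.5 * (lo + hi)
            if par_rate(forwards + [mid] * n_new, dt) < target:
                lo = mid
            else:
                hi = mid
        forwards += [0.5 * (lo + hi)] * n_new
    return forwards


# Hypothetical par swap data: 1-year, 2-year, and 5-year rates
data = [(4, 0.055), (8, 0.058), (20, 0.0635)]
fitted = fit_forwards(data)
for n_periods, target in data:
    print(n_periods, f"{target:.4%}", f"{par_rate(fitted[:n_periods]):.4%}")
```

Piecewise-flat forwards jump at each swap maturity, so even this toy fit shows how a constraint-matching curve can come out discontinuous.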


Cash Rates and the Front Part of the Curve 

Cash determines the initial part of the forward rate curve. The forward rate for a 
deposit starting today and ending in 3M is just defined as the 3M cash rate; this 
starts the forward rate curve. There are some consistency issues^22.


20 Theoretical Calculation of the Forward vs. Futures Correction: The correction was 
calculated in my 1989 path integral II paper (Ref). It arises from straightforward
integration. The “classical path" rate r^^ about which rate fluctuations occur is the ED 
future rate in the instantaneous limit $\Delta T \to 0$. For small mean reversion $\omega$, the
difference between the future and forward rate is quadratic in $T - t_0$. The parameter $\sigma$ is
the Gaussian rate volatility (diffusion constant). See Ch. 43 on path integrals.


21 Numerical Value of the Futures vs. Forward Correction: Numerically the
corrections are roughly (“ball-park” as they say) < 1 bp for the front contracts, but
become substantial, on the order of 10 bp, for contracts settling in 5 years. Since the
bid-offer spreads of plain vanilla swaps are only a few basis points, these corrections are
significant.


22 Use of Cash Rates in Generating the Curve: The cash rates other than the 3M rate
can also be used as inputs for generating the forward rates, especially for short-term
effects (e.g. 1 week). Generally, the cash rates do not much influence the curve further out
than a few months, since there is a good overall consistency in the market between cash




The Break or Changeover Point 


In practice, a famous problem occurs at a break point or changeover point. Below 
this point, the forward rates are obtained with futures (with the correction) and 
above this point, swaps are used^23. The forward rates tend to be quite
discontinuous for T around this point. Other discontinuities can also occur. 
These discontinuities can lead to substantial effects in pricing some derivatives. 


Curve-Smoothing Algorithms 


Smoothing algorithms can be used to smooth out discontinuities in the curve
generated by the above procedure^24. Philosophically it is unclear whether the
curve really should be made as smooth as possible. The curve acts as if it has a
stored “potential energy”. In practice, if you make the curve drop in one place it
tends to bulge up in another place, somewhat like sitting on a large balloon. The
smoothing algorithms can also result in smooth but noticeably large oscillations.
These oscillations affect pricing of those securities that are not in the constraint
set (i.e., pricing still occurs with all the given constraints realized).
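
One of the simpler smoothing choices, blending a futures-based curve and a swap-based curve in an overlap region with weights that depend on $T$, can be sketched as follows. The linear weight, the overlap boundaries, the function name, and the sample rates are all assumptions made for illustration; real desks use their own (often proprietary) weighting.

```python
def blend_forward_curves(times, fwd_futures, fwd_swaps, t_lo, t_hi):
    """Blend two forward curves with a weight that is linear in T on [t_lo, t_hi].

    Below t_lo only the futures-based forwards are used, above t_hi only the
    swap-based forwards; in between the two are mixed.
    """
    blended = []
    for t, f_fut, f_swp in zip(times, fwd_futures, fwd_swaps):
        if t <= t_lo:
            w = 1.0
        elif t >= t_hi:
            w = 0.0
        else:
            w = (t_hi - t) / (t_hi - t_lo)   # weight on the futures-based curve
        blended.append(w * f_fut + (1.0 - w) * f_swp)
    return blended


# Hypothetical forwards (start times in years) around a 2-3 year changeover
times = [1.5, 2.0, 2.5, 3.0, 3.5]
fwd_futures = [0.0600, 0.0620, 0.0630, 0.0660, 0.0670]
fwd_swaps = [0.0610, 0.0615, 0.0635, 0.0645, 0.0665]
for t, f in zip(times, blend_forward_curves(times, fwd_futures, fwd_swaps, 2.0, 3.0)):
    print(f"T = {t:.1f} yr: blended forward = {f:.4%}")
```

The blend removes the jump at a single changeover date, but it does not guarantee that every input constraint is still repriced exactly; that would need to be checked separately.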


Example of a Forward Curve 
Here is an illustrative picture. 


and futures. One chronic nuisance is that the combination of a 3M cash rate with a 3M 
forward rate starting in 3M should in principle just be the 6M cash rate. Ask your friendly 
curve constructor quant if he really gets that one right. 


23 Specification of the Changeover or Break Point Between Futures and Swaps: The
choice of where the changeover between the use of futures and the use of swaps occurs
depends on the algorithm. It can also depend on the traders, who may change it around.
Usually it is between 2 and 4 years for USD. For other currencies that have fewer futures,
the changeover point is earlier.


24 How to Be Smooth: Cubic splines are a popular choice for smoothing a curve with
discontinuities generated by an algorithm. Another is to calculate independently the
forward-rate curves generated by futures and by swaps. These are then added together in
an overlap region with weights that are chosen as some function of T. Other smoothing
choices exist, some sophisticated and proprietary.

[Figure: Forward Rates and Smoothed Fwd Rates]

The graph is of a forward 3M Libor curve without smoothing and with a simple
smoothing algorithm^25. The unsmoothed curve has regions of somewhat zigzag
behavior. The smoothed curve has some oscillations^26.


Model Risk and Curve Construction 


It should be recognized that the construction of the forward-rate curve is not well
posed in a mathematical sense. This is because there are an infinite number of
forward rates $\{f^{(T)}(t_0;\Delta T)\}$ for all $\{T\}$, or at least a large number at different


25 Illustration Only: The curve is meant to be illustrative and was produced by some
software I wrote. The data are not current. The smoothing algorithm is a simple technique 
invented for the book. Your algorithm will no doubt be much better. See the next 
footnote. 


26 Which Curve is Better: Smoothed or Unsmoothed? Say for argument that you have
this great whiz-bang smoothing technique but the rest of the world uses unsmoothed rates
(or "inferior" smoothing techniques). Then you will be off the market (which is after all
determined by the rest of the world) for some OTC products that are sensitive to the
regions of differences of the smoothed and unsmoothed rates. Conversely, if the world
smoothes the rates and you don't, you will also be off the market. Again, the smoothing
algorithm, when used, is not unique. Have fun.

$\{T\}$, but only a relatively small set of data points for constraints. Therefore,
different algorithms can and do result in different curves.

Later in this book, we will thoroughly discuss model risk. The construction of 
the forward curve is a good example where different but perfectly sane arguments 
can lead to different results. The differences in turn affect pricing and hedging of 
derivatives differently; this is part of model risk.


Rate Units and Conventions 


One of the annoying features of interest-rate products is the plethora of units or 
rate conventions. We need to know how to translate between these conventions. 
Perhaps a system uses one set of units in calculating, or a salesman wants to 
quote a result in a certain convention. 


Let $r_1$ be the convention for a rate. This means there are assumed to be
$N_{days\_1}$ days per year (e.g. 360, 365), a frequency $f_1$ of compounding (1 Annual,
2 Semiannual, 4 Quarterly, 12 Monthly), and the number $n_1$ of compounding
periods in a year (numerically $n_1 = f_1$). For example, $r_1 = 0.05/yr$ with
convention SA365 means $f_1 = 2$, $N_{days\_1} = 365$. If time differences are measured
in calendar days, the convention is called “actual”. The relation between the rate
conventions is given by the requirement that, independent of convention, the
same physical interest is produced in one year with $Dt_{1yr}$ calendar days, viz

$$ \left[1 + \frac{r_1 \cdot Dt_{1yr}}{f_1 \cdot N_{days\_1}}\right]^{n_1} = \left[1 + \frac{r_2 \cdot Dt_{1yr}}{f_2 \cdot N_{days\_2}}\right]^{n_2} \qquad (7.3) $$


The logic is as follows. First, assume $Dt_{1yr} = N_{days\_1}$ and take $f_1 = 1$. For an
initial amount \$N, at the end of the year of $N_{days\_1}$ days, by definition the $r_1$
convention produces interest $r_1 \cdot \$N$. If the frequency $f_1 = 2$, at the end of
$N_{days\_1}/2$ days the $r_1$ convention produces interest $r_1 \cdot \$N/2$. This interest is
reinvested for the remaining half of the $N_{days\_1}$ days, resulting in compounding.

Similarly, at the end of $N_{days\_2}$ days, with $f_2 = 1$ the $r_2$ convention produces
interest $r_2 \cdot \$N$. Hence at the end of $N_{days\_1}$ days the $r_2$ convention produces interest
$r_2 \cdot (N_{days\_1}/N_{days\_2}) \cdot \$N$. Generalizing the logic gives the result. Equivalently, we have

$$ r_2 = \frac{f_2 \cdot N_{days\_2}}{Dt_{1yr}} \left\{ \left[1 + \frac{r_1 \cdot Dt_{1yr}}{f_1 \cdot N_{days\_1}}\right]^{f_1/f_2} - 1 \right\} \qquad (7.4) $$


For example, if the $r_2$ convention is Q360 with $f_2 = 4$, $N_{days\_2} = 360$, we get
$r_2 = 0.0490/yr$. This is ten basis points less than for the $r_1$ convention. Such an
amount is definitely significant, being larger than the bid-ask spread for many
swaps in the market.

See the footnote for another convention^27 called 30/360, common for bonds.
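
The conversion between conventions is easy to check numerically. The Python sketch below solves the equal-interest condition for $r_2$ and reproduces the SA365 to Q360 example; the function name and the default of 365 calendar days in the year are assumptions made for the illustration.

```python
def convert_rate(r1, f1, n_days_1, f2, n_days_2, dt_1yr=365):
    """Convert rate r1 (frequency f1, n_days_1 days/year) to the convention
    with frequency f2 and n_days_2 days/year, requiring the same interest
    over one year of dt_1yr calendar days."""
    growth = (1.0 + r1 * dt_1yr / (f1 * n_days_1)) ** f1   # one-year growth factor
    per_period = growth ** (1.0 / f2) - 1.0                # growth per period in convention 2
    return per_period * f2 * n_days_2 / dt_1yr


r2 = convert_rate(0.05, f1=2, n_days_1=365, f2=4, n_days_2=360)
print(f"r2 = {r2:.4f}/yr, difference = {(0.05 - r2) * 1e4:.1f} bp")
```

Converting back with the arguments swapped recovers the original 5% rate, which is a quick sanity check on any implementation of this kind.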


Compounding Rates 

Compounding rates is related to rate conventions, but the emphasis is different.
Instead of coming up with two representations of the same rate, we wish to 
generate a composite rate from two other rates. For example, suppose that we 
have two neighboring 3M forward rates. We can find an equivalent 6M forward 
rate that generates the same interest as applying the first 3M forward rate and 
reinvesting the interest in compounding with the second 3M forward rate. 


Suppose that $r_a = f^{(T = t_a)}(t_0; 3M)$ is the 3M forward rate starting at time $t_a$
and ending at $t_b = t_a + 3M$, while $r_b = f^{(T = t_b)}(t_0; 3M)$ is the 3M forward
rate starting at $t_b$ and ending at $t_c = t_b + 3M$. The equivalent 6M forward rate
$r_{ab} = f^{(T = t_a)}(t_0; 6M)$ starting at $t_a$ and ending at $t_c = t_a + 6M$ (using
$N_{days} = 360$ for rates, with time differences “actual”) is given by

$$ 1 + \frac{r_{ab}\,(t_c - t_a)}{N_{days}} = \left[1 + \frac{r_a\,(t_b - t_a)}{N_{days}}\right]\left[1 + \frac{r_b\,(t_c - t_b)}{N_{days}}\right] \qquad (7.5) $$

Note that if we use Eq. (7.5) to get the rate difference $\delta r_{ab}$ in terms of $\delta r_a$ and
$\delta r_b$, and set $\delta r_a = \delta r_b = 1$ bp for risk, then the compound rate must be shifted
by $\delta r_{ab} > 1$ bp to be consistent.
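
A short Python sketch makes the compounding and the 1 bp remark concrete. It assumes simple interest over each actual/360 period and reinvestment of the first period's proceeds at the second forward rate; the function name, the two input rates, and the 92-day periods are illustrative assumptions.

```python
def compound_forward(r_a, days_ab, r_b, days_bc, n_days=360):
    """Equivalent forward rate over [t_a, t_c] from two neighboring forwards."""
    growth = (1.0 + r_a * days_ab / n_days) * (1.0 + r_b * days_bc / n_days)
    return (growth - 1.0) * n_days / (days_ab + days_bc)


r_a, r_b = 0.05438, 0.05476          # hypothetical neighboring 3M forwards
base = compound_forward(r_a, 92, r_b, 92)
bumped = compound_forward(r_a + 1e-4, 92, r_b + 1e-4, 92)
print(f"6M forward = {base:.5%}")
print(f"shift for a 1 bp move in both 3M rates = {(bumped - base) * 1e4:.4f} bp")
```

The shift in the compound rate comes out slightly above 1 bp because each 3M rate also earns interest on the other period's accrual.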


27 The 30/360 Day Count: This assumes that there are 30 days per month and 360 days
per year. So the number of days between two calendar dates apart by ($N_{years}$, $N_{months}$, $N_{days}$)
is calculated as: 360*$N_{years}$ + 30*$N_{months}$ + $N_{days}$. Corrections are made as follows: If the
first date is on the 31st, reset it to the 30th. If the first date is Feb. 28 and it's not a leap
year or if the first date is Feb. 29, reset it to the “30th of February”. If the second date is
on the 31st, and first date is on the 30th or 31st, reset the second date to the 30th.
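
The 30/360 rules in the footnote above translate directly into code. The Python sketch below follows those rules literally; the function name is hypothetical, and other 30/360 variants used in the market differ in their end-of-month details, so this should be read as an illustration rather than a reference implementation.

```python
import calendar
from datetime import date


def days_30_360(d1, d2):
    """Day count between d1 and d2 under the 30/360 rules stated in the footnote."""
    day1, day2 = d1.day, d2.day
    first_on_30_or_31 = day1 in (30, 31)
    # Last day of February: Feb. 28 in a non-leap year, or Feb. 29
    first_is_feb_end = d1.month == 2 and day1 == calendar.monthrange(d1.year, 2)[1]
    if day1 == 31 or first_is_feb_end:
        day1 = 30
    if day2 == 31 and first_on_30_or_31:
        day2 = 30
    return 360 * (d2.year - d1.year) + 30 * (d2.month - d1.month) + (day2 - day1)


print(days_30_360(date(1996, 3, 27), date(2001, 3, 27)))   # five years
print(days_30_360(date(2004, 1, 31), date(2004, 3, 31)))   # both dates on the 31st
```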
Rate Interpolation


Often we need to interpolate rates. A curve-generating algorithm may, for 
example, generate 3M forward rates at IMM dates. However, we need 3M rates 


at arbitrary times. Usually, simple linear interpolation suffices. So, if $f_{k-1}$ is the
IMM forward rate starting at $t_{k-1}$, and $f_k$ is the IMM forward rate starting at $t_k$,
then the 3M forward rate $f_a$ starting at an intermediate time $t_{k-1} < t_a < t_k$, and
reducing to the appropriate limits at the end points, is given by

$$ f_a = \frac{(t_k - t_a)\, f_{k-1} + (t_a - t_{k-1})\, f_k}{t_k - t_{k-1}} \qquad (7.6) $$
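
Linear interpolation between two neighboring IMM forwards is a one-liner in practice. In the Python sketch below, times are measured in days from today and the input values are made up; the function name is hypothetical.

```python
def interpolate_forward(t_a, t_prev, f_prev, t_next, f_next):
    """Linearly interpolate the 3M forward starting at t_a between the IMM
    forwards f_prev (starting at t_prev) and f_next (starting at t_next)."""
    if not t_prev <= t_a <= t_next:
        raise ValueError("t_a must lie between the two IMM start times")
    w_next = (t_a - t_prev) / (t_next - t_prev)
    return (1.0 - w_next) * f_prev + w_next * f_next


# Hypothetical IMM forwards starting 60 and 151 days from today
f_a = interpolate_forward(t_a=100, t_prev=60, f_prev=0.0548, t_next=151, f_next=0.0567)
print(f"interpolated 3M forward = {f_a:.4%}")
```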
8. Interest-Rate Swaps and Credit Default Swaps
(Tech. Index 5/10) 


Interest Rate Swaps: Pricing and Risk 


We begin a detailed discussion of pricing and risk management of interest rate 
derivatives'”. We start with interest rate swaps'. The simplest version of an 
interest-rate swap is the exchange of fixed-rate interest payments for floating-rate 
interest payments. For example, corporation ABC might receive the fixed rate E 
and pay a floating rate. Here is a picture of a plain-vanilla swap’. 


Fixed vs. Floating Swap 


Wall Street Broker- 
«—————— | Dealer Swap Desk 


ABC Corporation 


1 Acknowledgements to Traders, Brokers, Sales: I thank brokers at Eurobrokers and
traders at FCMC for breaking me into real-world aspects of interest-rate derivatives. I 
also thank many traders at Citibank, Smith Barney, Salomon Smith Barney, and 
Citigroup through the various mergers, for helpful interactions. I thank sales people for 
informative conversations. Much of the presentation of the practical real-world Risk Lab 
part of the book comes from what I learned from these people. I also thank my colleagues 
at Bloomberg LP for helpful discussions, especially on CDS. 


2 History: I got direct exposure to practical fixed-income derivatives at Eurobrokers, and
as the Middle-Office risk manager for FCMC. This work was based on pricing and risk- 
management software that I designed and wrote (cf. Ch. 34 for a few details). 


3 Swap Picture: Diagrams are used to clarify deals. With complicated transactions
involving various counterparties and many swap legs, diagrams are essential.

The most common floating rate is Libor. Other rates are Prime and CMT,
which we discuss in Ch. 10.


OIS, Libor, Swaps, and All That 


As mentioned in Ch. 7, since the 1st edition of this book OIS has come in
(since 2008) as the de facto standard for risk-free discounting when collateral is
present. Forward rates are still based on Libor". 

Let’s see what is going on physically. Libor (plus whatever spread is in the 
contract) times the swap notional is cash you have to pay in a receive-fixed Libor 
swap, by definition of the deal. The forwards are based on Libor. In order to get 
the cash, posting collateral is required". Given sufficient collateral you need only 
pay interest at the OIS risk-free rate to get the cash. Hence OIS is the rate used 
for discounting the cash. 

Complications include the type(s) of collateral, the currency of the collateral, 
how much collateral is required, etc. 

For simplicity, these issues will at first be ignored for swap pricing or 
hedging. The basics need to be examined before including complexities. 

Later we give two examples of the complexity of two interest rates in 
modeling: (1) Repo and Libor for Treasury bond option modeling (Ch. 16), and 
(2) Libor and credit spreads for convertible bond modeling (Ch. 14). 


Swaps and Risk Management 


Consider the diagram above. Why is this risk management for ABC? ABC 
Corporation may have issued fixed rate debt but prefers, for its own reasons (for 
example asset-liability matching), to pay floating rate debt. Therefore, ABC 
exchanges the interest rate payments. Alternatively, in the case that ABC issued 
floating rate debt but prefers to pay fixed, the swap would go the other way. 
From the point of view of the broker-dealer BD, the swap in the picture is a “pay- 
fixed” swap. The BD swap desk will hedge the swap in a manner that we will 
consider in some detail. 

Of course, ABC may not really be performing risk management. Maybe ABC
thinks that rates will decrease, due to its own analysis. Now usually the forward
curve is upward sloping, implying that the market's “expectation” is that rates will
increase. Thus, ABC is “betting against the forward curve”. Sometimes this has
worked in the past for corporations, and other times it has failed.

Swap Kinematics


We next show a plain-vanilla USD fixed-float swap in a spreadsheet-like format,
maybe similar to a system you might encounter, along with comments^4. We begin
with the deal definition. Again, these quantities give the deal “kinematics” that
would be stored by the books and records of the firm.


Deal Definition

 1   ID                        ABC Corp. 0001
 2   Fixed Rate                6.35%
 3   Floating Rate             3 months
 4   Deal date                 3/19/96
 5   Start date                3/27/96
 6   End Date                  3/27/01
 7   Notional                  $100 MM
 8   Currency                  USD
 9   Notional Schedule?        No
 10  Floating Rate Spread      0
 11  Floating rate type        USD Libor
 12  Cancelable?               No
 13  Payments in Arrears?      Yes


Comments: Deal Definition

 1   Deal identification that will be put into the database.
 2   Conventions for rate units will be specified (30/360, act/360 etc.)
 3   Time between successive specifications (“fixings”) of the floating rate
 4   Date the deal is being done. This is an old deal.
 5   First date specified in the contract for some sort of action.
 6   Last date specified in the contract. This deal has expired.
 7   Normalization for calculating cash flows.
 8   Payments in different currencies lead to "cross currency swaps".
 9   “No” means that the normalization is the same for each date.
 10  Means no additional interest paid above the floating rate.
 11  Other interest rates are possible but less common.
 12  If cancelable, options called "swaptions" are involved.
 13  Payments are made 3 months after floating rates are determined.


Next, we give the input parameters for pricing the swap. These parameters 
are the forward-rate curve, where the forwards correspond to the particular 
floating rate on which the swap is based (3-month Libor is the most common). 


4 Systems Appearance: These range from actual spreadsheets to GUIs of whatever
eclectic esthetics the systems designer chooses (naturally with feedback from the traders
who have to use it). Probably the appearance would be much more attractive than what I
use here for illustration. The comments would not appear; they are for the reader.

Normally this curve would be produced in another part of the system and read
into the module used for pricing and hedging. Therefore, we have:


Parameters

 1   Forward Rate Curve


Comments: Parameters

 1   See previous chapter for discussion


Forward Rate Curve input to swap pricing 


The graph shows the forward rate curve used for the swap in this particular case 
along with the break-even (BE) rate of the swap. 


[Figure: Forward Rates and Break-Even Rate. Series: Fwd Rates, BE Rate. Horizontal axis: swaplet number, 1 to 20.]


The BE rate is the average of all 20 rates with discounting included (see the 
section at the end for the math). The “swaplets” are the components of the swap 
and are discussed below. 
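
A minimal Python sketch of such a discounted average follows. The weighting used here (accrual period times discount factor), the single-curve discounting built from the forwards themselves, the equal quarterly periods, and the illustrative forward curve are all assumptions made for the example, not the deal's actual inputs.

```python
def break_even_rate(forwards, dt=0.25):
    """Discount-weighted average of the period forward rates."""
    df = 1.0
    weights = []
    for f in forwards:
        df /= 1.0 + f * dt          # discount to the end of the period (payment in arrears)
        weights.append(dt * df)
    return sum(w * f for w, f in zip(weights, forwards)) / sum(weights)


# Hypothetical curve: 20 quarterly forwards rising linearly from 5.4% to 7.0%
forwards = [0.054 + (0.070 - 0.054) * i / 19 for i in range(20)]
be = break_even_rate(forwards)
simple_average = sum(forwards) / len(forwards)
print(f"BE rate = {be:.4%}, simple average = {simple_average:.4%}")
```

For an upward-sloping curve the BE rate lies a little below the simple average, because the later (higher) forwards carry smaller discount factors.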


Pricing and Hedging for the swap 


We now give the representative pricing and overview of hedging results for this
swap^5.

Pricing and Overview of Hedging

 1   Swap value ($000)     $40.5
 2   BE Swap Rate          6.359%
 3   Delta total           1714.5
 4   Gamma total           -87.2


Comments: Pricing and Overview of Hedging

 1   This swap was practically "at the money" with very little value.
 2   Equivalent fixed rate that would have led to zero swap value.
 3   The initial hedge would be to buy 1714 ED futures contracts ($43K/bp).
 4   Gamma is expressed as 100 * change in number of contracts/bp.

This swap is viewed from the point of view of the counterparty paying the
fixed rate, at the deal date. Since the fixed rate (6.35%) was slightly below the
BE rate (6.359%), the swap had a small positive value. A swap has a small
convexity, i.e. second derivative with respect to rates producing gamma ($\gamma$), due
to the discount factors being nonlinear in the rates. However, gamma cannot be
hedged with Eurodollar (ED) futures since ED futures have no convexity (a
change in rates by 1 bp changes any ED future by $25 without any discounting
correction). Therefore, gamma would need to be hedged with other instruments.
Alternative reporting specifications of $\Delta$ and $\gamma$ in $US are also used, multiplying
by $25/contract/bp.


Looking Inside a Swap: Swaplets 


There are 20 swaplets in a 5-year swap corresponding to various dates set in the 
contract. Each swaplet has an associated forward rate and value. The swaplet 
delta and gamma risks are individually given in equivalent numbers of ED 
futures contracts. These result from the sensitivities to the forward rates on which a
given swaplet depends. Here is a picture of one swaplet inside a swap.


5 Where Did this Swap Example Come From? This swap is the same as in my first
1996 CIFEr tutorial. While rates have dropped dramatically since 1996, the principles
are general and the results for current swaps will be similar. I wrote the swap pricer.

[Figure: Diagram for one swaplet, showing the swaplet time interval dt at a time forward from “now”]


The pricing and risk details of the swaplets are shown below. The biggest 
sensitivity is to the forward rate in the same line in the table as the swaplet, but 
there are other sensitivities due to the discount factors. 


Swaplet  Fwd. Rate (%)  Start      End        Value ($000)  Delta  100 * Gamma
 1       5.438          3/27/96    6/27/96    (230.00)      101    -4.7
 2       5.476          6/27/96    9/27/96    (217.00)      100    -4.7
 3       5.667          9/27/96    12/27/96   (165.00)       97    -4.6
 4       5.898          12/27/96   3/27/97    (107.00)       95    -4.5
 5       6.058          3/27/97    6/27/97     (69.00)       95    -4.6
 6       6.162          6/27/97    9/29/97     (45.00)       96    -4.7
 7       6.277          9/29/97    12/29/97    (17.00)       91    -4.5
 8       6.402          12/29/97   3/27/98      11.00        86    -4.3
 9       6.424          3/27/98    6/29/98      17.00        91    -4.6
 10      6.464          6/29/98    9/28/98      25.00        86    -4.4
 ...
 20      7.005          12/27/00   3/27/01     118.00        70    -3.8


Simple Scenario Analysis for a Swap 


Now we consider the change in the characteristics of a swap under a hypothetical
simple scenario. The scenario is that all forward rates $f^{(T)}$ regardless of maturity
are raised by 10 bp keeping time fixed, i.e. $df^{(T)}_{Scenario} = 10$ bp. Sometimes rates can
change suddenly by this magnitude, e.g. in less than 1 day. We could also
envision a time-dependent scenario, although the simplest and most common
procedure is to separate out the time dependence by moving time forward while
keeping rates fixed. The change in the forward rates (10 bp) is assumed equal for
all forward rates, a “parallel shift”. Other scenarios, with the various forward
rate changes taken as unequal, give the yield-curve shape risk.

The revalued or reval (“new”) results are shown below for the value of the 
swap and the Greeks along with the initial (“old”) results, and the changes. 


Scenario Reval                                 New        Old        Change
 1. Swap value ($000)                          $468.1     $40.5      $427.5
 2. Delta Hedge (contracts)                    1,705.8    1,714.5    (8.7)
 3. Gamma (100 * Change in # Contracts)        -86.9      -87.2      0.3
 4. Break-even Swap Rate                       6.46%      6.36%      0.10%


Comments 


1. Pay fixed swap makes money because the received floating rates increased 
2. Fewer contracts needed to hedge the swap 

3. Very little change in gamma 

4. Break-even rate for the swap after rates increase also goes up 10 bp 


The “Math Calc." for Price Changes Using the Greeks 


We now compare the approximate results using the Greeks to the exact results
obtained through scenario revals. The delta contribution gives almost all the
change in the value of a swap:

$$ \$dS_{Delta} = \Delta_{contracts} \cdot \frac{\$25}{bp \cdot contract} \cdot df_{Scenario} \qquad (8.1) $$

with $df_{Scenario}$ = 10 bp. Inserting the old $\Delta$ = 1714.5 contracts gives
$\$dS_{Delta}$ = $428.6K. The convexity due to gamma is negative because the discount
factors decrease with increasing rates. We have

$$ \$dS_{Gamma} = \gamma_{dcontracts/bp} \cdot \frac{\$25}{bp \cdot contract} \cdot \frac{\left(df_{Scenario}\right)^2}{2} \qquad (8.2) $$


Inserting $\gamma$ = -0.87 change in contracts per bp gives the small result
$\$dS_{Gamma} \approx$ -$1.1K. The sum gives the “Math Calc.” change $\$dS_{MathCalc}$,

$$ \$dS_{MathCalc} = \$dS_{Delta} + \$dS_{Gamma} \qquad (8.3) $$

The result for $\$dS_{MathCalc}$ is equal to the scenario revaluation difference to this
accuracy, $\$dS_{Scenario}$ = $427.5K. Actually, this result is “too accurate” because a
real-world hedge would just involve an even multiple of 100, i.e. 1700 contracts.

Now we need to check the change in delta $d\Delta_{contracts}$ due to gamma. We
should have to a good approximation

$$ d\Delta_{contracts} = \gamma_{dcontracts/bp} \cdot df_{Scenario} \qquad (8.4) $$

Plugging in gamma as $\gamma$ = -0.87 gives the change in the number of contracts as
$d\Delta_{contracts}$ = -8.7, in agreement with the “reval” result.
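
The arithmetic of this “Math Calc.” check is simple enough to script. The Python sketch below uses the numbers quoted in the text (delta of 1714.5 contracts, gamma of -0.87 contracts per bp, $25 per bp per contract, a 10 bp parallel shift); the function and variable names are illustrative.

```python
DOLLARS_PER_BP_PER_CONTRACT = 25.0   # value of 1 bp for one ED futures contract


def greek_estimates(delta_contracts, gamma_contracts_per_bp, shift_bp):
    """First- and second-order estimates of the swap value change, plus the
    estimated change in the delta hedge, for a parallel shift in bp."""
    ds_delta = delta_contracts * DOLLARS_PER_BP_PER_CONTRACT * shift_bp
    ds_gamma = gamma_contracts_per_bp * DOLLARS_PER_BP_PER_CONTRACT * shift_bp ** 2 / 2.0
    d_delta = gamma_contracts_per_bp * shift_bp
    return ds_delta, ds_gamma, d_delta


ds_delta, ds_gamma, d_delta = greek_estimates(1714.5, -0.87, 10.0)
print(f"delta term: ${ds_delta / 1e3:,.1f}K")
print(f"gamma term: ${ds_gamma / 1e3:,.1f}K")
print(f"total:      ${(ds_delta + ds_gamma) / 1e3:,.1f}K")
print(f"change in delta: {d_delta:.1f} contracts")
```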


Interest Rate Swaps: Pricing and Risk Details 


In this section, we follow up the introduction to swaps with some more detail. 
Consider a broker-dealer swap desk BD that transacts a new swap with a 
customer ABC. Typically, the new swap risk to BD will be hedged immediately. 
This new deal also may be considered in the context of a portfolio of deals 
already held by BD. The actual hedge put on by BD for the new deal may depend 
on what happens to be in its book, as well as the risk appetite and view (if any) of 
BD regarding the market, perhaps leading to a specific strategy. Regardless, no 
hedging strategy will be exact, and residual risks will remain under any 
circumstance. These residual risks will then go into the portfolio risk and change 
the overall risk of the desk^6.

In order to be concrete, we consider the hedging of an isolated swap in USD 
for fixed payments vs. floating payments^7.


Hedges 
The following instruments are available as possible hedges: 


• Short-term money-market cash instruments
• Eurodollar (ED) futures


6 Limits for Risk: The desk risk may be controlled by limits set by an independent risk
management group. This involves negotiations and periodic reviews. See Ch. 40. 


7 Basis Swaps: There are also swaps with floating payments in one rate vs. floating
payments in another rate. These are called “basis swaps”. The hedging then depends on
the correlations between the two floating rates. For example, we can have a 3-month vs.
6-month Libor basis swap. A “basis” in general needs to exist to reproduce the market
value of the basis swap; this basis is added to the forwards of one of the rates.

• FRAs
• Other swaps
• Treasuries
• Treasury futures


The swap can be divided up into constituent swaplets, as we have seen. These
swaplets have individually different cash flow specifications with definite rules
for determining the reset values of the rate on which the swap is based, the cash
flow dates^8, etc. according to the contract^9 between BD and ABC. The following
complications theoretically enter into the hedging of the swap.


Hedging with Euro Dollar (ED) Futures 


The cash flows originating in the forward time period where the ED futures are
relatively liquid can be hedged with ED futures^10. In general, the cash flows for
the swap will occur at various times that will not coincide with the ED future
IMM dates^11. The following picture should clarify things.


8 Payments in Arrears: The various swap cash-flow dates are generally “in arrears”,
which means that a cash payment is made some time after the rate determining that cash
payment is set (or “reset” or “fixed”). An extra discount factor for the extra time period
from the reset time to the cash-flow time is included. If a payment is not “in arrears”, the
payment is said to be “up front”.


9 The All-Important “Term Sheet” and Why You Need It: The legal contract for the
deal will be preceded (before the deal is transacted) by a “term sheet” that contains the 
details of the proposed transaction. These term sheets may or may not be generic, and 
they can change up to the last minute depending on the transaction dynamics between the 
desk BD and the customer ABC. In early discussions, the term sheet may only be 
schematic. The quant should always get a term sheet (hopefully the latest one, even if
indicative or provisional, and even if the salesman argues that he doesn’t have it yet)
before wasting time modeling, pricing, and hedging using wrong assumptions regarding 
the nature of the deal. Having said this, a fair amount of back-and-forth quantitative 
analysis often occurs for a given deal if there are non-generic aspects. 


10 Hedging with ED Futures: The liquidity of the ED futures is greatest for times within 
the front few years, and the implementation of ED future transaction is more difficult 
further out in time. In addition, futures are generally transacted in groups of 100 
(anything smaller is called an “odd lot” and is inconvenient to transact). Often an ED 
hedge will be transacted first quickly using the front few contracts for the whole swap 
risk and then some time later the hedge will be distributed among hedge instruments with 
other transactions. Various residual risks corresponding to the inexactness of the hedging 
for each of the points raised above will occur. 


11 IMM Swaps: Some swaps, generally short term swaps, do have fixing dates at the ED
future IMM dates (these are imaginatively called IMM swaps). IMM swaps are very
quickly transacted and are popular with short-term swap desks.

[Figure: Swaplet reset times located between IMM dates. Two timelines with 3M spacing: reset dates for a typical swap, and IMM dates for ED futures.]


An interpolation of the risks in time then needs to be performed. In general, 
this is just done proportionately to the time intervals from a given swaplet cash 
flow to the IMM dates before and after that cash flow. 
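
A proportional split of one swaplet's risk between the two neighboring ED futures might look like the Python sketch below. The linear weighting (the nearer IMM date receives the larger share), the function name, and the sample numbers are assumptions made for illustration; desks may use other allocation rules.

```python
def split_risk_to_imm(t_reset, t_imm_before, t_imm_after, delta_contracts):
    """Allocate a swaplet's delta to the ED futures before and after its reset
    time, in proportion to the time intervals to the two IMM dates."""
    span = t_imm_after - t_imm_before
    w_after = (t_reset - t_imm_before) / span   # share going to the later future
    return delta_contracts * (1.0 - w_after), delta_contracts * w_after


# Hypothetical swaplet: 95 contracts of delta, reset 30 days after an IMM date,
# with the next IMM date 91 days after the previous one (times in days)
before, after = split_risk_to_imm(30, 0, 91, 95.0)
print(f"earlier future: {before:.1f} contracts, later future: {after:.1f} contracts")
```

The two allocations always add up to the swaplet's total delta, so the aggregate hedge size is unchanged by the interpolation.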

For a mnemonic, see the footnote^12.


Hedging with Cash Instruments 


The swap may have an uncertain cash flow before the first Eurodollar future 
IMM date. This leads to "spot risk", and can be hedged with the short-term 
money market cash instruments. Sometimes this is not done, and the spot risk 
will be approximately hedged with the nearest ED future". 


12 ED Futures and Swaps Hedging Mnemonic: Put your hands together with 
interlocking fingers. Your right hand intersects at the swap-reset times and your left hand 
intersects at the IMM dates. You need to interpolate the risks of each of the swap-reset 
dates of each right-hand finger with the neighboring two futures of the neighboring 
fingers on your left hand. 


13 Initial Cash Flow in the Swap: New swaps can also have an initial cash flow that is set 
in the contract. A swap already in the books may have a cash flow that has been reset but 
not yet paid. There is a risk due to the uncertainty of the discounting from the analysis 
date until the payment date for the entire cash flow. However, if the swap is sold, the cash 
to be paid from the reset to the transaction date is sometimes set on an “accrual” basis not 
including discounting. 

14 The End-Year Effect: There is an “end-year” effect that happens at the end of the year 
which leads to anomalies in the money market and which needs to be hedged separately. 
For the year 2000, this effect was very large. 
Hedging with FRAs 

An FRA¹⁵ (forward rate agreement) is the same as a one-period swap, i.e. a 
swaplet. FRAs have an advantage in that the periods over which they exist are 
measured in months from today, not at IMM dates. For this reason the dates can 
be tailored to match those of a new swap. 


Hedging with Other Swaps 


If the ABC swap happens to be a swap depending on other rates¹⁶ (e.g. CP, 
Prime, Muni, CMT etc.) then a Libor swap can be transacted in the opposite 
direction. The residual risks here involve “basis risk” due to the difference in the 
behavior of the rate from Libor. Date or time risk also occurs since the cash flow 
dates of the two swaps will generally not be identical. There is also normalization 
risk since the ABC swap notional or principal¹⁷ may not match the hedge swap 
principal. See the ISDA¹⁸ reference for data on notionals¹⁹. If the ABC swap 


15 FRAs: FRA = Forward Rate Agreement. Either say the letters “F.R.A.” or say the 
acronym "fra". FRAs are short-term money market instruments (see Stigum, Ref.). The 
floating side of an FRA is a forward rate starting at t = p months and ending at T = q 
months. FRAs are represented as pxq (e.g. 1x4, 3x6, 6x12). For 3x6, say “Threes Sixes”. 
FRAs are cash settled proportional to the difference of the floating rate and the contract 
rate. There is a consistency between the FRA market and the futures market. 


16 Other Rates: Examples include CP = Commercial Paper, Prime = Prime rate set by 
banks, CMT = Constant Maturity Treasury, Muni = A municipal index rate. These 
“basis” rates have their own swap rates often expressed as a “spread” or difference with 
respect to Libor swap rates (except for Muni rates which use ratios with respect to Libor 
rates depending on tax considerations). Some of these swaps (e.g. CP) involve 
complicated rules regarding averaging the rates over various date periods. 


17 Notional and Principal: “Notional” and “principal” mean the same thing. The notional 
is a normalization factor that multiplies the overall expression for the swaplets 
comprising the swap. Sometimes the swap market is characterized by the total notional in 
all transactions. 


18 ISDA: This is the International Swaps and Derivatives Association, Inc. ISDA 
describes itself as the global trade association representing leading participants in the 
privately negotiated [fixed income] derivatives industry. ISDA was chartered in 1985 and 
today has over 550 members around the world. ISDA Master Agreements are always 
used for contracts for plain-vanilla deals (see the Documentation Euromoney Book Ref). 


19 The Total Interest Rate Swap Notional and Why You Don't Care: The 2001 figure 
for the total outstanding notional for interest rate swaps was over $50T (Trillion) 
according to ISDA (ref), and has increased substantially. “Outstanding” means all deals 
already done that have not expired or closed. This initially somewhat scary number has 
very little to do with the real risk in swaps which is many orders of magnitude less. For 
example we saw above that a $100MM notional swap has a risk of around $40K per bp 
change of rates to the BD if the swap is unhedged. The BD will generally hedge this risk 
down to a very small fraction of $40K. Finally, swaps on the other side (receive fixed vs. 
pay fixed) will behave in the opposite fashion. 
is an amortizing swap (the principal is different for the different swaplet cash 
flows), further complications exist²⁰. 


Hedging with Treasuries 


U.S. treasuries (which are very liquid) can also be used as hedges. Again there is 
basis risk since treasury rates are generally not the same as the rate on which the 
swap is based, date risk corresponding to mismatched cash flows, normalization 
risk, etc. An exception is for CMT swaps, which have a natural hedge in 
treasuries. The market for “repo” becomes involved²¹. 


Hedging with Treasury Futures 


Treasury futures involve complications, including the “delivery option” for the 
“cheapest to deliver”²³. The inclusion into risk systems is standard. 


Cross currency swaps involving exchange of principal amounts (not just interest 
payments) are a different story. 


20 Notional or Principal Schedules for Amortizing Swaps: Amortizing swaps are often 
specified by the customer ABC according to a “schedule” of the different notionals for 
the different cash flows, corresponding to specific ABC needs. For example, the first cash 
flow could have $100MM notional and the second cash flow $90MM notional, etc. 
Naturally, ABC will have to pay an extra amount to BD for such custom treatment, but 
part of this will be eaten up by the residual risk forced on the BD. 


21 Repo: The repo market is an art unto itself. The simplest version is overnight repo, 
where a dealer sells securities to an investor who has a temporary surplus of cash, 
agreeing to buy them back the next day. This amounts to the BD paying an interest rate 
(repo) to finance the securities. The arrangement can be made such that the BD keeps the 
coupon accrual. The complexities of repo (including "specials") enter the hedging 
considerations. We give an illustrative example of a repo deal in Ch. 16. 


22 Acknowledgement: I thank Ed Watson for discussions on repo and many other topics. 


23 Bond Futures and the CTD: A bond future (as opposed to ED futures) requires 
delivery of a bond to the holder of the bond future from a party that is short the bond 
future. However, this does not mean delivery of a definite bond, but rather of any bond 
chosen at liberty from a set of bonds specified for that particular bond future. One of 
these possible deliverable bonds (the “cheapest to deliver” or CTD) is economically the 
best choice for the holder of the short position, who gets to choose. The CTD bond (with 
today’s rates) thus determines the characteristics of the bond future today. However, since 
the future behavior of rates is uncertain, another different bond may wind up being the 
cheapest to deliver when it actually becomes time to deliver a bond. This uncertainty 
shows up as a correction to the bond future’s price and its sensitivity to interest rates. 
Other complexities exist for bond futures. 
Example of Swap Hedging 
We already considered the basic idea in a previous section. Again consider the 
above 5-year swap, non-amortizing with $100MM notional, based on 3-month 
Libor with payments in arrears, where the BD pays a fixed rate to the customer 
ABC. The risk to BD from the ABC swap is that rates decrease. This is because 
BD would then receive less money from ABC, as determined by the decreased 
floating Libor rate that ABC pays to BD. Therefore the pay-fixed swap of the BD 
must be hedged with instruments that increase in value as rates decrease. These 
include for example buying bonds or buying ED contracts or both²⁴. 

As we saw, the total delta or DV01²⁶ from a swap model for this swap is 
short 1714 equivalent ED contracts²⁸. This means that if ED futures were to 


24 “Short and Long the Market” for Swaps: The pay-fixed swap makes the BD “short 
the market" and the pay-fixed swap loses money as rates decrease. In order to hedge the 
pay-fixed swap, the BD must go “long the market", buying instruments that make money 
as rates decrease. Generally, going long the market is in the same direction as going long 
(buying) bonds, whose prices increase as rates drop. 


25 More Swap Jargon: Say that the bid swap spread is 40 bp/yr and the offer 44 bp/yr. 
This means that a potential fixed rate payer is ready to pay 40 bp/yr and a potential fixed 
rate receiver wants to receive 44 bp/yr (above the corresponding treasury rate). A “bid 
side swap" for a BD means that the BD pays the fixed rate. Since the BD in that case 
receives floating rate payments, which increase when the swap spread increases for fixed 
treasury rate, the BD who pays fixed is said to be “long the spread” and “long the swap". 


26 DV01 Warning: Careful. Some desks use the convention that DV01 or delta 
corresponds to a rate move up, and some desks use the convention that DV01 
corresponds to a rate move down. This complicates aggregation of risk between desks. 
The story gets worse. See the following footnote. 


27 More on Different DV01 Conventions and Risk Aggregation Problems: Other 
conventions for expressing delta or DV01 exist. One convention uses the notional for a 
given bond or swap (e.g. 10 year). That is, the risk is expressed in terms of the number of 
10-year treasuries it would take to hedge the overall risk. Sometimes the risk is expressed 
in terms of zero-coupon rate movements. Sometimes the risk is expressed in terms of a 
mathematical rate present in a model which is used to generate the dynamics, but which 
actually has no physical meaning. The difference between the reported DV01 using these 
different conventions can easily be on the order of a few percent. Finally, sometimes the 
magnitude of the rate move will be different between desks (e.g. 1 bp, 10 bp, 50 bp) and 
thus may include some convexity. This is often done as a compromise involving 
numerical stability issues of the models dealing with options that are also in the portfolio. 
These conventions will sometimes but not always be marked on the desk risk reports. All 
this can generate confusion when aggregating risk. Similar anarchy unfortunately exists 
for other risks as well. Sometimes this whole problem is ignored, or goes unrecognized. 


28 More Handy ED Futures Jargon: A “strip” is a set of four neighboring ED futures 
contracts corresponding to one year. These strips are given names. The “Front Four” are 
the first contracts (ED1-ED4). The “Reds” are the next four (ED5-ED8), then the 
“Greens”, the “Blues”, the “Golds”, etc. Moreover, the months corresponding to 
expiration are abbreviated (e.g. December is called DEC pronounced “Deece”). The 
names are convenient shorthand for the traders. 
constitute the total hedge for the swap, the BD would have to buy (“go long”) 
1714 ED futures. For a one basis point (1/100 of 1%) decrease in Libor, one ED 
future increases in value by $25, so the risk to BD for the ABC swap hedge is a 
loss of 1714*$25 or around $43,000 for each bp of overall (parallel) rate 
increase²⁹. There is also a small correction from “gamma” terms³⁰. 
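
As a quick check of the arithmetic above, here is a minimal sketch. The $25 per bp per contract and the 1714 contracts come from the text; the function name and the sign convention (a positive move means rates go up) are illustrative assumptions.

```python
ED_DOLLARS_PER_BP = 25.0     # value change of one ED future per 1 bp move (from the text)
EQUIVALENT_CONTRACTS = 1714  # total hedge size quoted in the text


def long_futures_pnl(contracts_long, rate_move_bp, dollars_per_bp=ED_DOLLARS_PER_BP):
    """P&L of a long ED futures position: it gains when rates fall."""
    return -contracts_long * dollars_per_bp * rate_move_bp


# A 1 bp parallel increase loses money on the long futures hedge,
# and a 1 bp decrease gains the same amount.
print(long_futures_pnl(EQUIVALENT_CONTRACTS, +1))
print(long_futures_pnl(EQUIVALENT_CONTRACTS, -1))
```

The exact product is $42,850, which the text rounds to about $43,000 per bp.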


The “Delta Ladder”, or Bucketing: Breaking Up the Hedge 
A graph of the first 3 years of the hedge needed for the swap is shown below. 


[Figure: Hedge for 3 yrs of 5-yr swap (equivalent ED contracts); the plotted series is Delta.] 


29 The Movements of Rates are Roughly Parallel: This risk is expressed in the simplest 
case for all forward rates determining the swap increasing together by one bp. While 
different forward rates by no means move by the same amount in the real world, 
nonetheless in practice, the approximation of this “parallel shift” covers much of the risk 
and is the measure commonly used for a first-order approximation. The extent to which 
the forward rates do not move in parallel will present risks insofar as the individual 
swaplet deltas are not individually hedged. One way to look at rate movements is using 
principal components, which we treat in Ch. 45 and in Ch 48 (Appendix B). 


30 Gamma for a Swap: A Libor swap has small gamma (second derivative) risk, so most 
of the risk is just due to the first derivative or delta (or DV01). The gamma risk is 
expressed by specifying the change in the number of equivalent contracts per bp increase 
in rates. For this swap, this number is about -0.9 for the entire swap. As we shall see 
below, the details of bucketing gamma in forward time are complicated since gamma is 
really a matrix of mixed second partial derivatives corresponding to the different forward 
rates comprising the swap. Thus, there is really no exact gamma ladder. However, an 
effective approximate gamma ladder will be constructed. 
The “ladder” or set of “buckets” is a breakup of the total hedge in maturity steps. 
For example, we can use 3-month “buckets”. The units are equivalent ED future 
contracts. An approximate hedge for this front part of the swap could consist of 
buying 100 of the first three “strips”. That is, the approximate hedge for the first 
3 years could consist of buying 100 contracts of each of the first 12 contracts 
ED1—ED12. The “spot” risk up to the first contract would need to be hedged 
separately, as would the tail risk from 3—5 years of 561 equivalent contracts (see 
discussion below). Residual risks would be put into the portfolio. 


Hedging Long-Dated Swaps 


Hedging a long-dated swap is achieved by first choosing hedging instruments and 
then minimizing the risk. The risk can be broken up into buckets, but in general, 
it will not be possible to make the risk zero in every bucket unless the buckets are 
chosen suitably large. 

Consider, for example, an amortizing 15-year Libor swap. Aside from back- 
to-back offsetting of this swap with a similar 15-year swap, there is no single 
natural hedge. A general strategy is to employ a mixture of the hedging 
instruments. For illustrative purposes, we use five, ten, and thirty-year treasuries 
along with the first 2 years of ED futures. The buckets are thus 0-2 years, 2-5 
years, 5-10 years, and 10-30 years. 


Algorithm, Illustrated in the following figure 


• The 10-15 year “tail” risk of the 15-year swap, \( \Delta^{(10\text{-}15\,\text{yr})}_{\text{Swap}} \), is hedged by an 
appropriate number of 30-year treasuries. This is done by choosing the 10-30 
year treasury risk³¹ \( \Delta^{(10\text{-}30\,\text{yr})}_{30\text{-yr Tsy}} = -\Delta^{(10\text{-}15\,\text{yr})}_{\text{Swap}} \). Thus, the tail risk of the swap 
past 10 years is hedged. The detailed risk is not hedged however; the risk 
report will show the total [swap + 30-year treasury hedge] having canceling 
risks between 10-15 years and 15-30 years. 

• The combined 5-10 year risk \( \Delta^{(5\text{-}10\,\text{yr})}_{\text{Swap}\,+\,30\text{-yr Tsy}} \) of the swap and 30-year 
treasuries is hedged by 10-year treasuries with 
\( \Delta^{(5\text{-}10\,\text{yr})}_{10\text{-yr Tsy}} = -\Delta^{(5\text{-}10\,\text{yr})}_{\text{Swap}\,+\,30\text{-yr Tsy}} \). 


31 How to Break up a Treasury to Implement the Hedging Back-chaining Algorithm: 
The bucketed risk of a treasury bond is determined in the same way as the bucketed risk 
of the fixed leg of a swap. The forward rates determine the discount factors that in turn 
determine the bond price. The discount factors are simply differentiated with respect to 
the forward rates. The amounts of the various treasuries in the hedges are simply obtained 
by determining the notional values corresponding to the integrated risks in the intervals 
needed and specified in the back-chaining procedure. 
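• The combined 2-5 year risk of the swap and the 10, 30-year treasury hedges 
\( \Delta^{(2\text{-}5\,\text{yr})}_{\text{Swap}\,+\,10,30\text{-yr Tsy}} \) is hedged by 5-year treasuries, 
\( \Delta^{(2\text{-}5\,\text{yr})}_{5\text{-yr Tsy}} = -\Delta^{(2\text{-}5\,\text{yr})}_{\text{Swap}\,+\,10,30\text{-yr Tsy}} \). 

• The combined 0-2 year risk of the ABC swap and all treasury hedges 
\( \Delta^{(0\text{-}2\,\text{yr})}_{\text{Swap}\,+\,5,10,30\text{-yr Tsy}} \) is hedged with ED futures, 
\( \Delta^{(0\text{-}2\,\text{yr})}_{\text{ED}} = -\Delta^{(0\text{-}2\,\text{yr})}_{\text{Swap}\,+\,5,10,30\text{-yr Tsy}} \). 


[Figure: Hedging a Swap with Treasuries and ED Futures. Surviving labels: “Swap tail risk > 10 yrs: Hedge with 30-yr treasury”; “Total 5-10 yr risk: Hedge with 10 yr treasury”; “Total 0-2 yr risk: Hedge with first 8 ED futures”.] 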
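
The back-chaining steps above can be sketched in a few lines. All numbers below are hypothetical bucketed risks chosen only to make the arithmetic easy to follow; the function name `back_chain` and the data layout are illustrative assumptions, not the format of any real risk report.

```python
# Buckets ordered from the longest to the shortest maturity, as in the algorithm.
BUCKETS = ["10-30y", "5-10y", "2-5y", "0-2y"]

# Hypothetical bucketed risk of the swap (say, in $K per bp).
swap_risk = {"10-30y": -30.0, "5-10y": -45.0, "2-5y": -35.0, "0-2y": -20.0}

# Hypothetical bucketed risk per unit of each hedge instrument. Each instrument
# is used to hedge the longest bucket in which it has risk.
hedge_profiles = [
    ("30y Tsy", {"10-30y": 6.0, "5-10y": 3.0, "2-5y": 2.0, "0-2y": 1.0}),
    ("10y Tsy", {"5-10y": 4.0, "2-5y": 2.5, "0-2y": 1.5}),
    ("5y Tsy", {"2-5y": 2.5, "0-2y": 1.0}),
    ("ED futures", {"0-2y": 1.0}),
]


def back_chain(risk, profiles, buckets):
    """Hedge the longest bucket first, then carry the hedge's risk to shorter buckets."""
    combined = dict(risk)
    amounts = {}
    for bucket, (name, profile) in zip(buckets, profiles):
        # Choose the amount so that the combined risk in this bucket is zero.
        amount = -combined[bucket] / profile[bucket]
        amounts[name] = amount
        # The hedge instrument also adds risk to the shorter buckets.
        for other_bucket, unit_risk in profile.items():
            combined[other_bucket] += amount * unit_risk
    return amounts, combined


amounts, residual = back_chain(swap_risk, hedge_profiles, BUCKETS)
print(amounts)
print(residual)
```

Because each instrument only has risk in its own bucket and shorter ones, the procedure is a triangular solve, which is why working backward from the longest maturity zeroes every bucket in a single pass.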
Changes in the Hedges with Time 

The total DV01 or delta risk of the ABC swap plus the treasuries plus the ED 
futures is zero at the time the above back chaining algorithm is implemented 
(modulo odd lots or fractional residual effects). However all this was valid only 
at one time. As time progresses, changes to the hedges need to be made. These 
include events like these: 


• A payment is made or received for the swap. At this time, a corresponding 
hedge must be removed because the swap risk has changed. 

• An IMM date passes. Before this happens the trader will “roll over” the front 
contract which is about to expire with the second contract about to become 
the front contract. 


The hedge components that hedge the deal in the various buckets can be 
found using a computer minimization routine. The result amounts to a back- 
chaining algorithm. In practice, a simpler hedge could be executed (e.g. a rough 
hedge only with 10-year treasuries) and later adjusted to a complicated hedge. 


Swap Pricing Math and Risk 
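Here is a summary of the math for an interest rate swap. The value \( \$S \) of the 
swap “today” is given by the sum of swaplet values \( \$S_l \), 

\[
\$S = \sum_{l=0}^{N_{swap}} \$S_l = \sum_{l=0}^{N_{swap}} \$N_l \left( \hat f^{(T_l)} - E \right) \cdot dt_{swap}\, P^{(T_l)}_{Arrears} \qquad (8.5)
\]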


Here \( \hat f^{(T_l)} \) is the forward rate of the type specified by the contract³², for the 
given maturity (or fixing or reset) date \( T_l \). In this section we use a caret ^ to 
emphasize that the forward rates are determined by today's data. The fixed rate 
\( E \) may be defined to include a spread. The time interval \( dt_{swap} \) is specified by 
the contract (e.g. 3 months) as the time between successive \( T_l \). The notionals or 
principals \( \$N_l \) are usually constant, but as noted above may not be constant 
depending on the client's requirements. If, as is common, the notionals \( \$N_l \) 
decrease with time, the swap is called an amortizing swap. 


Each discount factor \( P^{(T_l)}_{Arrears} \) is the zero-coupon bond that discounts back 
from the time that the cash flow occurs to today. Normally this cash flow is “in 

32 More on Swap Specification Details: This includes rate specification, including the 
type (Libor etc.), the day-count convention (money market, 30/360 etc.), holidays when 
payments cannot be made, and a number of other items. 
arrears”, a time \( dt_{swap} \) later than the maturity or reset date \( T_l \). Note that \( P^{(T_l)}_{Arrears} \) 
depends on all the forward Libor rates \( \{ \hat f^{(T_{l'})} \} \) up to the date \( T_l \). 

The sum runs over all terms specified in the contract from the \( l = 0 \) term 
(which usually involves accrued interest) to the last term \( N_{swap} \). For a 
single-currency swap, the exchange of the notional is not done. 


The Swap Break-Even Rate (Par Swap Rate) 
The BE rate \( R_{BE} \) (also called the Par Swap Rate) is given by setting the value of 
the swap to zero, and is the average of the forward rates with discounting 
included, viz 

\[
R_{BE} = \frac{\sum_{l=0}^{N_{swap}} \$N_l \cdot \hat f^{(T_l)} \cdot dt_{swap}\, P^{(T_l)}_{Arrears}}{\sum_{l=0}^{N_{swap}} \$N_l \, dt_{swap}\, P^{(T_l)}_{Arrears}} \qquad (8.6)
\]


Given the BE rate, the value of the swap is determined. We define the 
quantity \( \$X = \sum_{l=0}^{N_{swap}} \$N_l \, dt_{swap}\, P^{(T_l)}_{Arrears} \), and we write³³ 

\[
\$S = (R_{BE} - E) \cdot \$X \qquad (8.7)
\]


In this form, it is clear that the value of the swap is (up to normalization) 
given by the difference between the break-even rate and the fixed rate. 
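
The relations (8.5) to (8.7) can be checked numerically with a short sketch. It assumes the simple-compounding discount factor that the text gives as Eq. (8.11), rates in decimal, and a constant accrual period; the function names and the sample curve are hypothetical.

```python
def discount_factors(forwards, dt):
    """P_Arrears for each swaplet: product of 1/(1 + f*dt) up to and including l."""
    factors, p = [], 1.0
    for f in forwards:
        p /= 1.0 + f * dt
        factors.append(p)
    return factors


def swap_value(forwards, fixed_rate, notionals, dt):
    """Eq. (8.5): sum of N_l * (f_l - E) * dt * P_l over the swaplets."""
    p = discount_factors(forwards, dt)
    return sum(n * (f - fixed_rate) * dt * d for n, f, d in zip(notionals, forwards, p))


def break_even_rate(forwards, notionals, dt):
    """Eq. (8.6): returns the break-even rate and the quantity $X of Eq. (8.7)."""
    p = discount_factors(forwards, dt)
    x = sum(n * dt * d for n, d in zip(notionals, p))
    floating = sum(n * f * dt * d for n, f, d in zip(notionals, forwards, p))
    return floating / x, x


# Hypothetical 5-year quarterly swap: upward-sloping forwards, $100MM notional.
forwards = [0.050 + 0.001 * l for l in range(20)]
notionals = [100e6] * 20
dt, fixed = 0.25, 0.055

r_be, x = break_even_rate(forwards, notionals, dt)
value = swap_value(forwards, fixed, notionals, dt)
print(r_be, value, (r_be - fixed) * x)  # the last two numbers should agree
```

Setting the fixed rate equal to the computed break-even rate should return a swap value of zero up to rounding, which is the defining property of the par swap rate.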


Stripping the Curve 


“Stripping the Curve” is the method used to generate the discount factors by 
recursion³⁴. Call the set of current break-even (par-swap) rates \( \{ R^{(n)}_{BE} \} \) for 


33 Numeraire and Numéraire: $X is called the “numeraire” for the swap break-even 
rate. Effectively the numerator of the swap break-even rate is “measured” in units of the 
denominator $X. For more about numeraires, see Ch. 43. Je sais que je devrais écrire 
numéraire et non pas numeraire. Accent risk. Bad Joke Alert. 
\( n = 1, 2, \ldots \) Set \( \$N_l = \$N \) and \( dt_{swap} = \Delta t \) independent of \( l \), with no accrual in 
the \( l = 0 \) term. We note that a floating rate note FRN is at par at reset, and use 
Eq. (8.18) along with Eq. (8.6). After a little algebra we get a recursion relation 
for \( P^{(T_n)}_{Arrears} \) in terms of previous zero-coupon bonds, which is the desired result: 

\[
P^{(T_n)}_{Arrears} = \frac{1 - R^{(n)}_{BE}\, dt_{swap} \sum_{l=0}^{n-1} P^{(T_l)}_{Arrears}}{1 + R^{(n)}_{BE}\, dt_{swap}} \qquad (8.8)
\]
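
A round-trip sketch of the stripping recursion follows: build par rates from a known forward curve, then recover the discount factors from the par rates alone. It assumes simple-compounding discount factors, a constant accrual period, and that the list of par rates starts with the one-period rate (which equals the first forward rate); the indexing and function names are illustrative assumptions.

```python
def discount_factors(forwards, dt):
    """Product of 1/(1 + f*dt) up to and including each period."""
    factors, p = [], 1.0
    for f in forwards:
        p /= 1.0 + f * dt
        factors.append(p)
    return factors


def par_rates(forwards, dt):
    """Break-even rate of the swap ending at each period, constant notional."""
    p = discount_factors(forwards, dt)
    rates = []
    for n in range(len(forwards)):
        floating = sum(f * dt * d for f, d in zip(forwards[: n + 1], p[: n + 1]))
        annuity = sum(dt * d for d in p[: n + 1])
        rates.append(floating / annuity)
    return rates


def strip_curve(break_even_rates, dt):
    """Recursion: P_n = (1 - R_n*dt*sum of earlier P) / (1 + R_n*dt)."""
    stripped = []
    for r in break_even_rates:
        stripped.append((1.0 - r * dt * sum(stripped)) / (1.0 + r * dt))
    return stripped


forwards = [0.050 + 0.001 * l for l in range(20)]  # hypothetical curve
dt = 0.25
original = discount_factors(forwards, dt)
recovered = strip_curve(par_rates(forwards, dt), dt)
print(max(abs(a - b) for a, b in zip(original, recovered)))
```

The printed number is the largest discrepancy between the original and the recovered discount factors, which should be at the level of floating-point rounding.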


Swap Risk, the Delta Ladder, and the Gamma Matrix 


In this section, we consider the swap risk in detail. In particular, we will consider 
different definitions of the delta ladder. We also provide a correct treatment of 
gamma using the gamma matrix and we will see why a simple gamma ladder 
(usually defined) is not viable. Our conclusions are more general than just for 
swaps; in particular, they apply to swaptions as well. 


Swap Risk 
In order to break down the swap risk, we need to consider the individual risk for 
each of the forward rates. The delta ladder is the collection \( \{\Delta_l\} \) whose exact 
definition will be considered below. Gamma is the derivative of delta. This 
means that gamma is really a matrix with elements \( \gamma_{ll'} \) since \( \Delta_l \) can be 
differentiated with respect to a forward rate with \( l' \neq l \). 

The change \( \$dS_{MathCalc} \) in the value of the swap using the delta ladder and the 
gamma matrix including the normalization \( \$N_{ED} = \$25/(\text{bp} \cdot \text{contract}) \) is the 
usual Taylor series result to second order, 

\[
\$dS_{MathCalc} = \$N_{ED} \left[ \sum_{l=0}^{N_{swap}} \Delta_l\, \delta f_l + \frac{1}{2} \sum_{l,l'=0}^{N_{swap}} \gamma_{ll'}\, \delta f_l\, \delta f_{l'} \right] \qquad (8.9)
\]

Here, \( \delta f_l \) is the assumed change in \( \hat f^{(T_l)} \). Note for parallel shifts, i.e. a constant 
value \( \delta f_{Scenario} \) for all \( \delta f_l \), we get the expression we had previously, namely 


34 Dual Curve Stripping: In the presence of OIS discounting and Libor forwards, this 
procedure becomes “dual curve stripping” and is more complex. The simplified treatment 
here is necessary to understand these more complicated issues. 
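\[
\$dS_{MathCalc} = \$N_{ED} \left[ \Delta \cdot \delta f_{Scenario} + \frac{1}{2}\, \gamma \cdot \left( \delta f_{Scenario} \right)^2 \right] \qquad (8.10)
\]

Here, the total delta is \( \Delta = \sum_{l=0}^{N_{swap}} \Delta_l \) and the total gamma is \( \gamma = \sum_{l,l'=0}^{N_{swap}} \gamma_{ll'} \). 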


The Delta Ladder 


We now consider the delta ladder \( \{\Delta_l\} \). We will see that there are several 
possible definitions. First, the variation of the discount factors, while 
complicated, contributes as a rule of thumb only around 10% to delta. So to get 
an approximate but reasonable estimate of delta, you can ignore the discount 
factors. In this approximation, delta \( \Delta_l \) (in ED contracts) for the \( l^{\text{th}} \) forward rate 
is obtained by two steps: (1) Differentiate the swaplet \( \$S_l \) with respect to the 
appropriate forward rate \( \hat f^{(T_l)} \) for that term, and (2) Divide the result by the 
normalization \( \$N_{ED} = \$25/(\text{bp} \cdot \text{contract}) \). 


There are two ways that the variation of the discount factors \( \{ P^{(T_l)}_{Arrears} \} \) can be 
included. We note first that the discount factor can be written in terms of the 
forward rates as 

\[
P^{(T_l)}_{Arrears} = \prod_{l'=0}^{l} \frac{1}{1 + \hat f^{(T_{l'})}\, dt_{swap}} \qquad (8.11)
\]


The two methods of including the discount-factor variation are then: 


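Method #1. The delta \( \Delta_l \) is assumed to contain the total variation of the whole 
swap price \( \$S \) (i.e. all swaplets) with respect to the \( l^{\text{th}} \) forward rate. Noting that 
the swaplet index has to be at least \( l \) in order to have a dependence on \( \hat f^{(T_l)} \), we 
get 

\[
\Delta_l^{(\text{Method \#1})} = \frac{1}{\$N_{ED}} \frac{\partial \$S}{\partial \hat f^{(T_l)}} = \frac{1}{\$N_{ED}} \sum_{l'=l}^{N_{swap}} \frac{\partial \$S_{l'}}{\partial \hat f^{(T_l)}} \qquad (8.12)
\]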
Method #2. The delta \( \Delta_l \) is assumed to contain the total variation of the 
individual swaplet price \( \$S_l \) with respect to all forward rates. The forward rate 
index only goes up to \( l \) for nonzero sensitivity of \( \$S_l \). We get 

\[
\Delta_l^{(\text{Method \#2})} = \frac{1}{\$N_{ED}} \sum_{l'=0}^{l} \frac{\partial \$S_l}{\partial \hat f^{(T_{l'})}} \qquad (8.13)
\]


The first method will produce \( \Delta_l \) that can easily be interpolated using futures 
contracts from the IMM dates, since the total swap dependence on a given rate is 
specified. However, the delta for the \( l^{\text{th}} \) swaplet is then not given by \( \Delta_l \). In the 
second method, the delta for the \( l^{\text{th}} \) swaplet is given by \( \Delta_l \). It can be useful for 
risk management to look at both of these methods. 
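
The two ladder definitions can be compared numerically by bumping each forward rate and revaluing each swaplet. The sketch below assumes simple-compounding discount factors, a constant notional, rates in decimal (so 1 bp = 1e-4), and a hypothetical curve; the function names are illustrative. Method #1 sums the sensitivity matrix over swaplets for a fixed forward rate, while Method #2 sums over forward rates for a fixed swaplet.

```python
def swaplet_values(forwards, fixed_rate, notional, dt):
    """Value of each swaplet: N * (f_l - E) * dt * P_l with P_l from simple compounding."""
    values, p = [], 1.0
    for f in forwards:
        p /= 1.0 + f * dt
        values.append(notional * (f - fixed_rate) * dt * p)
    return values


def sensitivity_matrix(forwards, fixed_rate, notional, dt, bump=1e-6):
    """matrix[i][j] = d(swaplet i) / d(forward j), by central differences."""
    n = len(forwards)
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, down = list(forwards), list(forwards)
        up[j] += bump
        down[j] -= bump
        v_up = swaplet_values(up, fixed_rate, notional, dt)
        v_down = swaplet_values(down, fixed_rate, notional, dt)
        for i in range(n):
            matrix[i][j] = (v_up[i] - v_down[i]) / (2.0 * bump)
    return matrix


# $25 per bp per contract, expressed per unit of decimal rate.
DOLLARS_PER_UNIT_RATE_PER_CONTRACT = 25.0 / 1e-4


def delta_ladders(forwards, fixed_rate, notional, dt):
    """Return (Method #1 ladder, Method #2 ladder) in equivalent ED contracts."""
    m = sensitivity_matrix(forwards, fixed_rate, notional, dt)
    n = len(forwards)
    k = DOLLARS_PER_UNIT_RATE_PER_CONTRACT
    method1 = [sum(m[i][j] for i in range(n)) / k for j in range(n)]  # per forward rate
    method2 = [sum(m[i][j] for j in range(n)) / k for i in range(n)]  # per swaplet
    return method1, method2


forwards = [0.050 + 0.001 * l for l in range(20)]  # hypothetical curve
method1, method2 = delta_ladders(forwards, 0.055, 100e6, 0.25)
print(round(sum(method1), 3), round(sum(method2), 3))
```

The individual buckets differ between the two methods, but both ladders add up to the same total delta because they sum the same matrix in a different order.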


A numerical IMM delta ladder can be produced by varying each forward rate 
\( \hat f^{(IMM)}_m \) at the corresponding IMM date \( T^{(IMM)}_m \), then reconstructing the forward 
rate curve and recalculating the swap³⁵. 

A further complication is that for Libor other than 3-month (e.g. 6-month 
Libor or 1-month Libor), an appropriate change of variables has to be made to 
compare to the 3-month IMM ladder. 


Non-Parallel Shifts, the Gamma Matrix, and the Sick Gamma Ladder 
Unfortunately, many systems are incapable of dealing with gamma matrices, and 
incorrectly calculate only the diagonal elements \( \gamma_{ll} \). A simple gamma ladder 
\( \{\gamma_l\} \) uses only the diagonal elements, \( \gamma_l = \gamma_{ll} \). This diagonal gamma ladder is 
however inaccurate, as we shall now see³⁶. 
Let us (following these systems) incorrectly identify the total gamma with the 
diagonal elements only, \( \gamma_{\text{Diag Only}} \), i.e. 


35 So Which Delta Ladder does Your Report Show? In the text, we have exhibited 
several possibilities. Maybe the guy who programmed the risk system has just left the 
firm to become a junior trader. Did he document the formula that he programmed? Hmm? 


36 Sick Gamma Ladders and Risk Reports: Good traders know that the diagonal 
gamma ladder is not useful for math risk calculations. However, there are risk reports that 
obstinately show diagonal gamma ladders anyway. It is not clear how many people 
understand the problems with gamma ladders. In order to cure the diagonal gamma 
ladder, off-diagonal gamma matrix corrections have to be included (see the next section). 
\[
\gamma_{\text{Diag Only}} = \sum_{l=0}^{N_{swap}} \gamma_{ll} \qquad (8.14)
\]


This procedure clearly only works if the off-diagonal gamma matrix elements 
are small compared to the diagonal terms. This can produce misleading results 
with non-parallel yield-curve shifts. How can we see this? Imagine a forward rate 
shift scenario that goes up by 10 bp in the first bucket and then down by 10 bp in 
the second bucket. To a good approximation, the BE rate is unchanged and 
therefore the swap value is unchanged. However, a math calculation done with 
\( \Delta_1, \gamma_{11} \) for the first bucket and \( \Delta_2, \gamma_{22} \) for the second bucket leads to a 
cancellation of the delta terms and an addition of the gamma terms, implying the 
incorrect conclusion that the swap value should change. 

Generally, in order to perform risk management with changes in yield-curve 
shape, some systems can typically only perform revaluation computations 
(“revals”) without the possibility of analytic calculations. 


Approximate Factorization for the Gamma Matrix into the Gamma Ladder 


Here is a trick to get the off-diagonal gamma matrix elements that enables 
reasonably accurate approximate results for non-parallel yield-curve moves to be 
calculated analytically³⁷. It is important to emphasize that this approximation 
works on a deal-by-deal basis only. It is given by 

\[
\left| \gamma_{ll'} \right| \approx \sqrt{ \left| \gamma_{ll}\, \gamma_{l'l'} \right| } \qquad (8.15)
\]

So, for a given swap, the off diagonal gamma matrix elements are 
approximately given by the square root of the product of the two diagonal matrix 
elements. In order to motivate this, note that \( \gamma_{ll'} \) involves the variation of the 
discount factors. The relatively simple form of the discount factors then leads 
after some algebra to the above approximate relation. We also need to insert the 
sign. We assume that the gamma matrix has elements all of the same sign. For a 
swap, this sign is equal to the sign of the notional. 

Using this approximation, we find, now for arbitrary parallel or non-parallel 
shifts in the forward rate curve, for a given deal, using eq. 8.11, 

\[
\$dS_{MathCalc} = \$N_{ED} \left\{ \sum_{l=0}^{N_{swap}} \Delta_l\, \delta f_l + \frac{1}{2} \left[ \sum_{l=0}^{N_{swap}} \sqrt{\left| \gamma_{ll} \right|}\; \delta f_l \right]^2 \cdot \operatorname{sgn}(\$N) \right\} \qquad (8.16)
\]
1-0 


37 History: I discovered this approximation for the gamma matrix in 1991. It also turns 
out to work reasonably well for swaptions, even American swaptions. 
Now notice that, as in the above scenario, if one rate is moved up and its 
neighbor rate is moved down, the gamma term (now correctly) is approximately 
zero since the rate changes roughly cancel out in the sum, as desired. 

In this way, it is possible to define a modified set of diagonal gamma matrix 
elements that incorporate the effects of the off-diagonal terms. In that way, an 
effective gamma ladder including the effects of the off-diagonal terms can 
be resurrected. The numerical results for gamma ladders in this book have been 
corrected to include off-diagonal gamma matrix effects. 
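
The difference between the diagonal-only gamma term and the factorized gamma term can be seen in a small sketch. The diagonal gammas below are hypothetical (in contracts per bp), the shifts are in bp, and the function names are illustrative. The sketch only checks the algebra of the factorization; it does not test how well the approximation matches a full revaluation.

```python
import math

ED_DOLLARS_PER_BP = 25.0  # normalization: $25 per bp per contract


def gamma_term_diag_only(diag_gammas, shifts_bp):
    """Second-order term using only the diagonal gamma ladder."""
    return 0.5 * sum(g * s * s for g, s in zip(diag_gammas, shifts_bp))


def gamma_term_factorized(diag_gammas, shifts_bp, sign):
    """Second-order term with off-diagonals set to sign * sqrt(|g_l * g_l'|)."""
    weighted = sum(math.sqrt(abs(g)) * s for g, s in zip(diag_gammas, shifts_bp))
    return 0.5 * sign * weighted * weighted


def gamma_term_full_matrix(diag_gammas, shifts_bp, sign):
    """Same approximation written as an explicit double sum over the matrix."""
    total = 0.0
    for g1, s1 in zip(diag_gammas, shifts_bp):
        for g2, s2 in zip(diag_gammas, shifts_bp):
            total += sign * math.sqrt(abs(g1 * g2)) * s1 * s2
    return 0.5 * total


diag_gammas = [-0.04, -0.04]  # hypothetical diagonal elements for two buckets
twist = [10.0, -10.0]         # up 10 bp in the first bucket, down 10 bp in the second
parallel = [10.0, 10.0]

for name, shifts in (("twist", twist), ("parallel", parallel)):
    diag = ED_DOLLARS_PER_BP * gamma_term_diag_only(diag_gammas, shifts)
    fact = ED_DOLLARS_PER_BP * gamma_term_factorized(diag_gammas, shifts, -1.0)
    print(name, diag, fact)
```

For the twist scenario the diagonal-only term adds up to a spurious nonzero change, while the factorized term cancels; for the parallel shift the factorized term picks up the off-diagonal contributions that the diagonal-only ladder misses.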


A Non-Amortizing Swap is an FRN and a Non-Callable Bond 


If \( \hat f^{(T_l)} \) is Libor and the notionals are constant, the swap can be rewritten. 
Formally adding and subtracting a fictitious notional payment at the end of the 
swap, and breaking apart the sums, we get the result for a pay-fixed swap written 
in terms of an FRN and a non-callable bond³⁸: 

\[
\text{Swap}_{\text{Non-Amortizing, Pay-Fixed}} = \text{FRN} - \text{Bond} \qquad (8.17)
\]

Here FRN is a floating rate note containing the floating-rate part of the swap 
along with a notional payment at the end. It is easy to see that, at reset points, the 
FRN is at par. Between two reset points, the FRN is only slightly different from 
par (see below). Therefore, in this case, the risk of the swap can be obtained from 
the risk of a non-callable bond³⁹ with a coupon equal to the fixed rate of the swap 
and discounted with Libor. For a bond with a coupon including a credit spread 
with respect to Libor, the equation could be turned around to read 
\( \text{Bond} = \text{FRN} - \text{Swap}_{\text{Non-Amortizing, Pay-Fixed}} \), provided the fixed rate in the swap 
includes the credit spread. 


FRNs are at or Close to Par 
We consider floating-rate notes or FRNs briefly. At the first reset date \( t_0 \), the 
definition of the FRN is (leaving off the notional and with rates in decimal), 


38 What's the Duration of a Swap? In Ch. 9 we discuss the duration of a bond (see Eq. 
9.2). Note that the bond value is in the denominator. A swap at par has zero value, so the 
duration formula for a par swap breaks down. Using Eq. (8.17) we can however quote the 
duration of the bond with which a non-amortizing swap is associated. 


39 Noncallable Bonds: These bonds, also called “bullet bonds” are guaranteed, in the 
absence of issuer default, to pay coupons to maturity. Callable bonds contain embedded 
options giving the issuer the right, at any one of a set of defined times, to pay a certain 
amount to the bondholder and cancel the rest of the bond. See Ch. 9 for more information 
on bonds. 
\[
FRN^{[N]}(t_0) = \sum_{l=0}^{N} \hat f^{(t_l)}\, dt_l\, P^{(t_l)}_{Arrears} + P^{(t_N)}_{Arrears} \qquad (8.18)
\]

Here, the finite time interval between payments is \( dt_l = t_{l+1} - t_l \). The discount 
factor written in terms of the forward rates in money-market convention is 

\[
P^{(t_l)}_{Arrears} = \prod_{l'=0}^{l} \frac{1}{1 + \hat f^{(t_{l'})}\, dt_{l'}} \qquad (8.19)
\]

It is simple then to see the recursion \( FRN^{[N]}(t_0) = FRN^{[N-1]}(t_0) \). On the 
other hand, at \( N = 0 \) (one cash flow \( \hat f^{(t_0)} dt_0 \) paid at \( t_1 \)) we immediately get 
\( FRN^{[0]}(t_0) = 1 \). Hence, at reset, the FRN is at par. 

Note that if the value date is before the first reset, there is an extra discount 
factor. Similarly, if the value date is between the first reset and the second reset at 
\( t_1 \) the FRN is not at par, but will be close to par. 


The reader should carefully note that the rewriting of a swap as an FRN and 
a bond is not valid for amortizing swaps, where the swaplets have different 
notionals. Further, in a later chapter we will examine index-amortizing swaps. 
These are amortizing swaps with complicated knockout features. Misleading
results can be obtained by applying the FRN - Bond association incorrectly to such more
complicated instruments.


Cross-Currency Swaps 
Cross-currency swaps have principal exchange and interest-rate payments in
different currencies, e.g. EUR and USD. To bring all cashflows into a "reporting"
currency (e.g. USD), forward FX rates (in USD per one EUR), obtained using
interest rate parity (Ch. 5), are used to convert the EUR-denominated cashflows
into the USD reporting currency.
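The conversion step can be sketched as follows. This is a minimal Python illustration, assuming interest rate parity in its simplest form (forward FX = spot times the ratio of EUR to USD discount factors, with no cross-currency basis spread). The function names and all numbers are hypothetical.

```python
def forward_fx_usd_per_eur(spot, df_eur, df_usd):
    """Forward FX rate (USD per one EUR) from interest rate parity.

    spot   : spot FX rate, USD per one EUR
    df_eur : EUR discount factor to the cash-flow date
    df_usd : USD discount factor to the same date
    """
    return spot * df_eur / df_usd


def pv_usd_of_eur_cashflows(cashflows_eur, dfs_eur, dfs_usd, spot):
    """Convert each EUR cash flow at its forward FX rate, then discount in USD."""
    pv = 0.0
    for cf, df_eur, df_usd in zip(cashflows_eur, dfs_eur, dfs_usd):
        fwd = forward_fx_usd_per_eur(spot, df_eur, df_usd)
        pv += cf * fwd * df_usd
    return pv


# Illustrative numbers only (not market data)
cashflows_eur = [2.0, 2.0, 102.0]
dfs_eur = [0.99, 0.975, 0.96]
dfs_usd = [0.97, 0.94, 0.91]
spot = 1.10
pv = pv_usd_of_eur_cashflows(cashflows_eur, dfs_eur, dfs_usd, spot)
```

Under this assumption the result equals the EUR present value converted at spot. A deal basis spread breaks that equality, which is why the curve construction for these swaps needs more care.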


40 Forward FX: See Chapter 5 for a discussion of forward FX rates.


41 Cross-Currency Swaps — More Details: One distinction with a single-currency swap 
is that the notionals for a cross currency swap are paid/received at the beginning and then 
reversed at the end (in the different currencies). This can be done to match funding 
requirements for counterparties in different countries. If there is only one currency, only 
interest payments are used. Cross-currency swaps can get very complicated as cash flows 
can be converted into several currencies successively in the same deal. For example a 
Here a generalization of dual curve stripping for interest rate swaps in one
currency must be used, since there are forward rates and discount factors in each
of the two currencies. If there is collateral, OIS discounting is used. The Libor
forward rates are determined to reproduce the single currency Libor swap rates. 

In addition, a “deal basis spread” is needed to obtain the market price of an 
existing cross currency swap. This spread (a single number) is added to each 
interest-rate forward of one of the swap legs. In a more refined analysis, prices of 
cross-currency swaps with different maturities can be used to get a term structure 
of forward basis spreads that are added to the forward rates of one of the legs. 


Credit Default Swaps (CDS) 


A credit default swap (CDS) pays the CDS buyer in the event of the default of a
reference instrument, e.g. a bond of an issuer with entity “name” ABC. The
payment is either par value in return for the defaulted instrument (physical
settlement) or the difference between par and the market value of the defaulted
instrument (cash settlement). Thus the holder of a bond can hedge the credit risk
of the bond by buying a CDS. In return, the CDS buyer pays a regular fixed
coupon to the CDS seller, in addition to a possible upfront payment.

CDS indices exist, consisting of a number of entities. The index will pay
off if any of these entities defaults.


Consider a long CDS position. Call the CDS maturity date $T_N$, the fixed
coupon $E^{(T_N)}$ and upfront payment (as fraction of notional) $U^{(T_N)} = \$U / \$N$.
Consider the forward time interval $(T_l, T_{l+1})$ with time length $t_{l,l+1} = T_{l+1} - T_l$. The
probability of default $p^{(d)}_{l,l+1}$ in $(T_l, T_{l+1})$ is related to the average instantaneous
default probability (the hazard rate $h_{l,l+1}$) by


swap could be in EUR and JPY but reported in USD. Extra complications involving FX 
hedging occur. See Beidleman (Ref.) 


42 Upfront Payments and Coupon Conventions: After the 2008 recession, upfront
payments became more common to reproduce the market CDS valuations, and coupons
became stylized (e.g. 100 bp/yr or 500 bp/yr). A high upfront payment indicates that the
market evaluates the bond default probability as being high. As we show later, however, a
composite total spread can be defined to include any upfront payment and coupon.


43 CDS Spread: Sometimes the coupon is called a CDS spread. However, see later when
the total spread is also defined to include the effects of an upfront payment.


44 Indices: Examples are the Markit CDX indices (ref).
$$p^{(d)}_{l,l+1} = 1 - e^{-h_{l,l+1} \, t_{l,l+1}} \tag{8.20}$$


For large (small) hazard rate, the default probability goes to one (zero). 


Define $y_{l,l+1}$ as the discount rate in $(T_l, T_{l+1})$ and set $\tilde{h}_{l,l+1} = h_{l,l+1} + y_{l,l+1}$. We
denote the discount factor back from $T_l$ to today as $P^{(T_l)}$.

We need to calculate the total coupon paid up to the possible time of default
$t_d$ in the interval $(T_l, T_{l+1})$, given that there was no previous default before $T_l$.
The probability of no default in $(T_l, t_d)$ from Eq. (8.20) is $1 - p^{(d)}_{l,d} = e^{-h_{l,d} \, t_{d,l}}$,
where $t_{d,l} = t_d - T_l$ is the time to default in $(T_l, T_{l+1})$. Two approximations are
made: (1) We set the hazard rate plus discount rate in the time interval to default
$(T_l, t_d)$ equal to its average in $(T_l, T_{l+1})$, viz $\tilde{h}_{l,l+1}$. (2) For discounting,
default (if it occurs) is set half way through $(T_l, T_{l+1})$; this can be relaxed.


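We need the following integral $I_{l,l+1}$ for the probability-weighted coupon in
the interval $(T_l, T_{l+1})$, discounted back to $T_l$. We have


$$I_{l,l+1} = \int_{T_l}^{T_{l+1}} dt_d \, e^{-\tilde{h}_{l,l+1} \, t_{d,l}} = \frac{1}{\tilde{h}_{l,l+1}} \left[ 1 - \left(1 - p^{(d)}_{l,l+1}\right) P^{(T_{l+1})} / P^{(T_l)} \right] \tag{8.21}$$


Define the effective discount factor $P^{(l,l+1)}_{\text{disc}} = \sqrt{P^{(T_l)} P^{(T_{l+1})}}$ back to today.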


From Eq. (8.21), it is useful to define an “accrual factor” $c_{l,l+1}$ as


$$c_{l,l+1} = \frac{p^{(d)}_{l,l+1}}{h_{l,l+1} \, t_{l,l+1}} = \frac{-\, p^{(d)}_{l,l+1}}{\ln\left(1 - p^{(d)}_{l,l+1}\right)} \tag{8.22}$$


45 Notation: The interval time length was called $dt_l$ previously in this chapter, but that
notation would be confusing here. Also we drop the subscript “Arrears” for the discount 
factor, although discounting is generally done in arrears. 
The survival probability (no default) up to $T_l$ is $P^{\text{Survive}}_{0,l} = \prod_{j=0}^{l-1} \left(1 - p^{(d)}_{j,j+1}\right)$. The
discounted probability-weighted coupons paid in all intervals accrued up to
default plus the upfront payment (per unit notional) is then


$$\text{Paid}^{(T_N)} = U^{(T_N)} + E^{(T_N)} \sum_{l=0}^{N-1} c_{l,l+1} \, t_{l,l+1} \, P^{\text{Survive}}_{0,l} \, P^{(l,l+1)}_{\text{disc}} \tag{8.23}$$


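With recovery rate $R^{(T_N)}_{\text{recovery}}$, the probability-weighted amount received assuming
default half way through the interval is


$$\text{Received}^{(T_N)} = \left(1 - R^{(T_N)}_{\text{recovery}}\right) \sum_{l=0}^{N-1} P^{\text{Survive}}_{0,l} \, p^{(d)}_{l,l+1} \, P^{(l,l+1)}_{\text{disc}} \tag{8.24}$$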


There is no factor $c_{l,l+1}$ in Eq. (8.24) since no accrual occurs as with the coupon.


For fair pricing (this is the risk-neutral no-arbitrage equilibrium statement), 
the amount paid equals the amount received on the probability-weighted basis. 
The coupon leg is the amount paid. The protection leg is the amount received. 


$$\text{Coupon leg with upfront} = \text{Protection leg}, \quad \text{i.e.} \quad \text{Paid}^{(T_N)} = \text{Received}^{(T_N)} \tag{8.25}$$


The market-implied default probabilities $\{p^{(d)}_{l,l+1}\}$ are determined for their
respective time intervals $(T_l, T_{l+1})$ such that Eq. (8.25) holds for the different
CDSs of all maturities $T_N$. This is done by determining each $p^{(d)}_{l,l+1}$
(bootstrapping), for $l = 0, 1, 2, \ldots$ successively.


The set $\{p^{(d)}_{l,l+1}\}$ is called the term structure of market-derived no-arbitrage
implied probabilities of default.
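The bootstrapping can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production pricer: one CDS quote per interval end, mid-interval discounting taken as the geometric mean of the two end-point discount factors, and a simple bisection for each new probability. All function names and numbers are hypothetical.

```python
import math


def accrual_factor(p):
    """Accrual factor c = -p / ln(1 - p) of Eq. (8.22); c -> 1 as p -> 0."""
    if p < 1.0e-12:
        return 1.0
    return -p / math.log(1.0 - p)


def cds_legs(probs, taus, dfs, coupon, upfront, recovery):
    """Return (paid, received) per unit notional, following Eqs. (8.23)-(8.24).

    probs[l] : default probability in interval l
    taus[l]  : length of interval l in years
    dfs[l]   : discount factor from T_l to today, with dfs[0] = 1
    """
    paid, received, survive = upfront, 0.0, 1.0
    for l, p in enumerate(probs):
        disc = math.sqrt(dfs[l] * dfs[l + 1])   # assumed mid-interval discounting
        paid += coupon * accrual_factor(p) * taus[l] * survive * disc
        received += (1.0 - recovery) * survive * p * disc
        survive *= 1.0 - p
    return paid, received


def bootstrap_default_probs(quotes, taus, dfs, recovery):
    """quotes[n] = (coupon, upfront) for the CDS maturing at the end of interval n."""
    probs = []
    for coupon, upfront in quotes:
        lo, hi = 0.0, 1.0 - 1.0e-12
        for _ in range(200):                     # bisection on the newest interval
            mid = 0.5 * (lo + hi)
            paid, received = cds_legs(probs + [mid], taus, dfs, coupon, upfront, recovery)
            if received < paid:
                lo = mid                         # protection leg too small: raise p
            else:
                hi = mid
        probs.append(0.5 * (lo + hi))
    return probs


# Round trip with made-up inputs: build fair coupons from known probabilities,
# then recover those probabilities by bootstrapping.
taus = [1.0, 1.0, 1.0]
dfs = [1.0, 0.97, 0.94, 0.91]
recovery = 0.4
true_probs = [0.010, 0.015, 0.020]
quotes = []
for n in range(len(true_probs)):
    annuity, protection = cds_legs(true_probs[: n + 1], taus, dfs, 1.0, 0.0, recovery)
    quotes.append((protection / annuity, 0.0))   # fair coupon, no upfront
implied = bootstrap_default_probs(quotes, taus, dfs, recovery)
```

If a quote is inconsistent with the earlier maturities, the bisection simply pins the new probability at an end of its range, which is a symptom of the difficulties with stressed names.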


46 Independence: These terms add since the probabilities are independent. 


47 Real-World Probabilities of Default: These are not the same as the market implied
risk-neutral probabilities of default considered here. See Ch. 31.
Difficulties may exist in practice in finding $\{p^{(d)}_{l,l+1}\}$ for stressed names. When
this happens, different assumptions may have to be made for the recovery rates to
obtain mathematically permissible implied default probabilities.

We can find an average probability of default $p^{(d;T_N)}_{\text{average}}$ such that Eq. (8.25)
holds for a particular $T_N$. This is how the market actually trades, and $p^{(d;T_N)}_{\text{average}}$
defines the “ISDA convention” for the probability of default. Recalling Eq.
(8.20), the average probability $p^{(d;T_N)}_{\text{average}}$ is related to the average hazard rate $h^{(T_N)}_{\text{average}}$
over the whole period $(t_0, T_N)$ by $p^{(d;T_N)}_{\text{average}} = 1 - e^{-h^{(T_N)}_{\text{average}} \, t_{0,N}}$. We also have
$P^{(0,N)}_{\text{disc}} = \sqrt{P^{(T_N)}}$, assuming again for discounting purposes, that when default
occurs, on average it happens half way through the whole period.

The ISDA convention equation is like Eq. (8.25) but with only one term. Noting
that $P^{(t_0)} = 1$ and setting $T_l \to t_0$, $T_{l+1} \to T_N$ in the above equations we get


$$U^{(T_N)} + E^{(T_N)} c_{0,N} \, t_{0,N} \, P^{(0,N)}_{\text{disc}} = \left(1 - R^{(T_N)}_{\text{recovery}}\right) p^{(d;T_N)}_{\text{average}} \, P^{(0,N)}_{\text{disc}} \tag{8.26}$$


With the ISDA convention, we can define an equivalent total spread $s^{(T_N)}_{\text{tot}}$
including both upfront payment and coupon,


$$s^{(T_N)}_{\text{tot}} = E^{(T_N)} + \frac{U^{(T_N)}}{c_{0,N} \, t_{0,N} \, P^{(0,N)}_{\text{disc}}} \tag{8.27}$$


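We can also write Eq. (8.26) using the total spread, viz


$$s^{(T_N)}_{\text{tot}} \, c_{0,N} \, t_{0,N} \, P^{(0,N)}_{\text{disc}} = \left(1 - R^{(T_N)}_{\text{recovery}}\right) p^{(d;T_N)}_{\text{average}} \, P^{(0,N)}_{\text{disc}} \tag{8.28}$$


In the case of no upfront payment, so $s^{(T_N)}_{\text{tot}} = E^{(T_N)}$, we get: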


48 Difficulties: Unphysical negative hazard rates or probabilities above one can result.
$$p^{(d;T_N)}_{\text{average}} = 1 - \exp\left[ -\frac{E^{(T_N)} \, t_{0,N}}{1 - R^{(T_N)}_{\text{recovery}}} \right] \tag{8.29}$$
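As a quick numerical illustration of Eq. (8.29), here is a minimal Python sketch for the no-upfront case. The function name and the sample inputs (a 100 bp/yr coupon, five years, 40% recovery) are assumptions chosen for illustration.

```python
import math


def isda_average_default_probability(coupon, years, recovery):
    """Average default probability over the whole period, no upfront payment.

    coupon   : CDS coupon in decimal per year (0.01 = 100 bp/yr)
    years    : time from today to the CDS maturity
    recovery : recovery rate in decimal
    """
    hazard = coupon / (1.0 - recovery)   # average hazard rate implied by the coupon
    return 1.0 - math.exp(-hazard * years)


p_avg = isda_average_default_probability(coupon=0.01, years=5.0, recovery=0.4)
print(round(p_avg, 4))
```

With these inputs the implied average hazard rate is 0.01 / 0.6 per year, so the default probability over five years comes out just under 8%.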


It is instructive to calculate using another method, namely taking default to be an 
explicit knock out. We also have to add the possibility that there is no default at 
all. The two methods are equivalent, which follows from integrating an equation 


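like Eq. (8.21) by parts. Using the ISDA convention over the interval $(t_0, T_N)$,
ignoring discounting, inserting the coupon, and setting $t_{0,N} = T_N - t_0$ we get


$$E^{(T_N)} c_{0,N} \, t_{0,N} = E^{(T_N)} \int_0^{t_{0,N}} e^{-h^{(T_N)}_{\text{average}} t} \, dt = \underbrace{\left[E^{(T_N)} t_{0,N}\right] \left[1 - p^{(d;T_N)}_{\text{average}}\right]}_{\text{No KO}} + \underbrace{\int_0^{t_{0,N}} \left[E^{(T_N)} t_d\right] \left[h^{(T_N)}_{\text{average}} \, e^{-h^{(T_N)}_{\text{average}} t_d} \, dt_d\right]}_{\text{Yes KO}} \tag{8.30}$$


Here the integration by parts is $\int u \, dv = uv - \int v \, du$. In the second term, the first
bracket is the coupon paid up to $t_d$ and the second bracket is the probability of knockout at $t_d$.

We identify the first term as corresponding to not having a default knockout
and the second term to having a default knockout.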


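The Risky DV01


Consider the ISDA convention. We define $\text{DV01}_{\text{risky}}$ as


$$\text{DV01}_{\text{risky}} = \frac{1}{s^{(T_N)}_{\text{tot}}} \left[\text{Coupon leg with upfront}\right] = \frac{1}{s^{(T_N)}_{\text{tot}}} \left[\text{Protection leg}\right] = c_{0,N} \, t_{0,N} \, P^{(0,N)}_{\text{disc}} \tag{8.31}$$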


Forward CDS Spreads 


Forward CDS spreads are defined by formulating the equilibrium condition Eq. 
(8.25) in a forward time period. Similar algebra is used. 
9. Bonds: An Overview (Tech. Index 2/10)


Bonds are debt instruments of many different sorts issued to raise money. Issuers 
of bonds include corporations, governments, municipalities, and agencies. In this 
book, an issuer is labeled as ABC and an investor as X. Bonds are obligations of 
ABC to pay back to X the borrowed money (called the "notional" or "par" 
amount) at the maturity date of the bond and in some cases earlier. In addition, 
the bonds have coupons (so called because the investor used to clip off "coupons" 
as pieces of paper to get paid). 

The world of bonds is an extremely complicated zoo; few people are expert
in more than one sector. There are many good finance books and reference
articles on each market, to which we refer the interested reader. The outline of
the rest of this chapter is:


• Types of Bonds
• Bond Issuance
• Bond Trading
• Bond Math


Types of Bonds


Bonds are classified in several ways. A short quick list follows. In the next 
section, we discuss some aspects of specific issuers. 


Fixed-Rate Bonds 


The most common type of coupon is the same for each payment in time, defining 
a fixed-rate bond. The amount of the coupon is determined by the credit rating of 
the issuer, along with conditions in the market at the time of issuance. Although


1 Bond Land: There is no way that the world of bonds can be described in detail while
keeping this book portable. The reader is referred to any bookstore with a finance section 
where you can spend hundreds of dollars to learn the details. 


2 Credit Ratings: Credit rating agencies (e.g. S&P and Moody's) historically rated the
credit of issuers through specialized procedures. Since the recession of 2008, the use of
agency ratings is in flux. Naturally, there are different credit notations. Investment grade
credits are defined as BBB or higher (S&P), and Baa or higher (Moody's).
Non-investment grade (also called high-yield or junk) is defined as BB or lower (S&P), and Ba
the coupon does not change, the price of the bond can change with the market
through the yield, defined below. 


Floating-Rate Bonds 


Some coupons change with time as determined by a rule, involving a changing 
rate or floating rate index. There are many such indices. The most common is 
Libor. Typically, a floating-rate bond will be issued at Libor plus a spread (e.g. 
Libor + 1%, or Libor + 100bp). Other floating-rate instruments are short-term 
money market notes based on, for example, commercial paper (CP). Floating
rate bonds stay near par, as discussed in the previous chapter. 


Zero-Coupon Bonds 
If no coupon exists, the bond unimaginatively is called a zero-coupon (ZC or 0C)
bond. We call $P^{(T)}$ the price of a ZC bond with maturity date $T$. Clearly, since
he gets no coupon payments, the investor X will pay much less than the notional
for one of these bonds (X buys the bond at a "discount"). Sometimes issuers do 
not want to pay coupons for cash-flow reasons. It is possible to decompose or 
“strip” a set of coupon bonds into zero-coupon bonds. There are government 
bond traders that look at and try to cash in on the (small) mispricings using this 
technique. 


Callable Bonds and Puttable Bonds 


Many bonds are "callable". This means that at certain dates the issuer ABC can, 
at prices known up front, demand that investors X sell back the bond.
A few bonds are "puttable". This means that at certain dates, investors X can 
make the issuer ABC buy back bonds for pre-determined prices. Generally, 
puttable bonds are also callable. The call and put features mean that the bonds 
contain embedded options. The quantitative analysis of such bonds is 
complicated; we shall examine some of the features in this book. Some derivative 
products, notably swaptions, are connected with these bonds. 

A callable bond is worth less than a non-callable bond because ABC has the 
option to call the bond, reducing its value to the bondholder X. 


or lower (Moody’s). The credit ratings are regularly reviewed and changed if deemed 
necessary. Agency credit ratings generally change over a longer time frame, except when 
an issuer is clearly in difficulty. After the 2008 crisis it was determined that some credit 
agency ratings were much too liberal, especially on complex securities. 

Of course the bubble-crash with no buyers for these securities after the crash was not 
predicted by anybody. Traders have their own view on credit that may or may not agree 
with the rating agencies, especially on a short-term basis when a rating seems suspicious. 

Large sell-side financial firms have their own internal credit rating models. 
Deterministic Coupon Changes: Step-Up Bonds


Some bonds have coupons that change according to a rule. If the coupons 
increase with time, the bond is called a “step-up”. Step-ups appeal to bearish 
investors who believe rates will increase and want increasing coupons that keep 
pace without having to sell existing bonds and buy new ones. Usually step-ups 
are callable, so the investor loses price appreciation in bull markets when rates 
drop and the bonds are called.


Bonds Depending on Other Markets 


Some bonds have coupons that depend on other markets besides interest rates. 
For example, equity-linked bonds pay more if a given stock or index increases. 
Some structured notes pay a lower coupon in return for a potential equity gain. 
Other possibilities include bonds whose coupon depends on a commodity (e.g. 
gold), or on the value of an FX exchange rate (e.g. Japanese Yen vs. USD), etc. 


Convertible Bonds 


Convertible bonds (“converts”) combine both interest-rate coupons and potential
equity. Converts can be exchanged for (or "converted into") equity. Converts
have a wide array of complex side conditions. Lower-credit issuers often use 
convertibles. Simple versions called PERCS, DECS, etc. also exist. We will look 
at some possibilities in Ch. 14. 


Mortgage-Backed Securities 


Mortgage-backed securities (MBS) are the repackaging of homeowner mortgages 
of various sorts into securities. MBS have non-deterministic coupons. This is 
because the coupons progressively disappear as homeowners increasingly prepay 
their mortgages, thus removing the underlying collateral. The determination of 
expected prepayments along with their uncertainties is extremely complicated 
and involves many variables. Complex mortgage products called CMOs


3 Convertible Bonds: Convertible bonds have a lower coupon than ordinary bonds of the
same credit. To compensate the investor receiving this low coupon, convertibles contain 
the implicit option to be converted to shares of stock under certain conditions. A simple 
picture of a convertible consists of an ordinary bond plus a stock option. The conversion 
option value requires analysis on each possible stock price path at each time in the future. 


4 Mortgage Prepayment Modeling: The subject of prepayment modeling is best thought
of to first approximation as a complicated mix of phenomenological wizardry, 
accompanied by fitting large amounts of data. The goal of a prepayment model is a 
parameterization of the historical behavior of people regarding the financing of their most 
valuable asset (their homes), and human behavior is not easy to model. A prepayment 
model may be applied to individual loans (if the data exist) or for groups (“pools”) of 
loans of a given type from a given issuer. Prepayments for agency and non-agency 
mortgages are different. Because prepayments depend on the housing market, the home 
(collateralized mortgage obligations) generally have many “tranches” with
different characteristics and complicated payment logic.

This book does not cover MBS. There are many excellent references, to
which we refer the reader.


Asset-Backed Securities 


Asset-Backed Securities (ABS) are the repackaging of anticipated cash flows 
from different sorts of assets into securities. These include auto loans, credit 
cards, home equity loans, equipment leases, student loans etc. Theoretically, any 
potential set of cash flows with enough certainty could be repackaged.


Munis 


Municipal or “muni” bonds are issued in a large variety of types, short and long 
term, with different funding goals by local and state entities, depending on their 
capital needs. The interest is generally exempt from federal tax and may be 
exempt from state and local tax.


Non USD Bonds 


Bonds can be issued in different currencies besides the US dollar (USD), for 
example British pound GBP, Euro EUR, Japanese yen JPY, etc. The choice of 
the currency depends on the markets and the appetites of investors in various
countries to buy the bonds [vii].


appreciation (or depreciation) may be modeled either using a simple scenario model or a 
complicated stochastic model. A prepayment model may also use the consumer credit 
rating of the homeowner and previous homeowner payment history. 


5 Implied Prepayments: When pricing mortgage products on the desk, the prepayments
from a model may be changed to “implied prepayments”, chosen to fit the market price. 


6 CMO Logic: There are entire systems to analyze the cash-flow rules for the different
tranches in CMOs. This must be done painstakingly from the contracts. Prepayments
affect different tranches in very different ways. Rules include the relative amounts of
interest or principal and the ordering of the cash-flow payments into the various tranches.


7 Solar ABS: Recently, ABS backed by solar energy receipts have been proposed. See Ch.
53 for a discussion of finance, energy, and climate change.


8 Muni Bond Land: This is a complex zoo. Muni bonds include General Obligation
(GO) Bonds, Revenue Bonds (housing, utility, health care, transportation, industrial),
Municipal Notes (TANs, RANs, GANs) etc. Some bonds called private activity bonds are
not tax exempt but are subject to the alternative minimum tax (AMT).
Guaranteed Bonds


Some bonds have guarantees (one historical example is Brady bonds for
emerging-market countries that had partial guarantees on some of their coupons). 
Some mortgage and muni bonds are also guaranteed. The investor pays an extra 
premium to cover the guarantee. 


Loans 


Banks make loans to clients. Bonds are generally riskier than loans. This is due to 
the generally longer maturity of bonds relative to the loans that the banks are 
willing to make, and due to the often-lower credit rating of the bond issuers 
relative to those corporations to which banks are willing to lend. Loans can also 
be repackaged into securities. 


Bond Issuance 


A broker-dealer BD may participate in facilitating the issuance of various types 
of fixed income securities. Here for illustration are the data for U.S. bond 
issuance for the first half of 2002 along with a little commentary to give some 
flavor. The specifics naturally change with time.


Illustrative Bond Issuance Data 


Type of Issuance        Amount Issued ($B)
U.S. Treasury           250
Federal Agencies        450
Municipal               200
Corporate               400
Asset-Backed            225
Mortgage-related        1,000
Commercial paper        1,325


Treasury Issuance 

U.S. Government Treasury issuance depends on tax receipts and thus on the 
strength of the economy, the debt ceiling, government spending, and projected 
budget deficits. 


9 Data Source: Research Quarterly, The Bond Market Association report, August 2002.
Issuance data are rounded off to the nearest $25B (billion) and trading data to the nearest 
$10B. Data are for the first half of 2002. Some of the issuance commentary in the text is 
from this source. 
Agency Issuance


The bulk of agency issuance is from Freddie Mac, Fannie Mae, and the
Federal Home Loan Bank. Note that this issuance is debt, i.e. bonds bought by
investors funding the agencies.


Muni Issuance 


Muni issuance in 2002 was at record levels following recent turbulence in the 
equity markets along with recent low rate levels. The refunding of bonds (from 
higher to lower coupons at current low interest rate levels) was up along with 
new issuances. 


Corporate Issuance 


Corporate issuance is sector dependent (e.g. telecommunications, manufacturing) 
and highly dependent on credit. Most corporate issuance is investment grade. 
High-yield issuance is an order of magnitude less than investment grade issuance. 
Convertible bond issuance is a small fraction of the total, and it has been 
decreasing. Overall, issuance in 2002 was down partly because some 
corporations were hurt by scandals. 


Asset-Backed Issuance 


ABS issuance depends on investor demand, which in turn depends on relative 
yields and perceived risk versus other bond markets. 


Mortgage Issuance 


Most mortgage issuance is through the agencies (FHLMC and FNMA, with 
GNMA somewhat less); there is also a small non-agency or “private label” 
component. Issuance of these mortgage "products" depends on the collateral of 
homeowner mortgages. The increasing prepayment of old mortgages and 
concurrent increased refinancing by homeowners in the current low rate 
environment led to more MBS/CMO issuance in 2002. Following the housing 
crisis of 2008, the future of these agencies in the mortgage market is in flux. 


10 Federal Agencies Issuing Debt: Freddie Mac is FHLMC (Federal Home Loan 
Mortgage Corporation), Fannie Mae is FNMA (Federal National Mortgage Association), 
and FHLB is the Federal Home Loan Bank. Other smaller agencies issue smaller amounts 
of bonds. These include Sallie Mae (dealing with student loans), the Farm Credit System 
FCS and the Tennessee Valley Authority TVA. Ginnie Mae or GNMA (Government 
National Mortgage Association) issues mortgage-backed securities, but not debt. 
CP Issuance
CP issuance was negatively affected by concerns over issuer credit quality. 


Bond Trading 


The secondary market is the market for trading bonds after they are issued. A
broker-dealer will have trading desks in the various bond markets, with different
traders specializing in different narrow sectors. Each market is highly specialized.
To give an idea, here are the trading volume data for the first half of 2002.


Illustrative Secondary Trading Volume Data 


Market                  Daily Trading Volume ($B/day)
U.S. Treasury           350
Federal Agencies        80
Municipal               10
Corporate               20
Mortgage-related        140


Trading, Flight to Quality, Diversification, Convergence, and Asteroids 


Flight to Quality 


A flight to quality occurs when investors take refuge in treasuries from risky 
assets in stressful markets. People generally try to escape risk by diversification, 
buying instruments in different sectors or different markets. There are also 
convergence plays or trades in which you “just have” to make money when two 
different instruments eventually “must” become identical. This idea works except 
when it doesn’t work. 

A stressed market, if severe enough, can bring on collective panic. We
might pictorially call the onset of a particularly crazy stressed market as due to an 
“asteroid”. Diversification strategies can be roiled in an asteroid-panic 
environment. This is because, no matter what the instruments, investors want to 


11 Phase Transitions and Collective Panic: Theoretically, the flight to quality can be
thought of as a phase transition, where we go from a highly disordered state (buyers and
sellers of various securities in comparable numbers) to a highly ordered state (basically
only sellers with buyers waiting). One possible framework is the critical 2nd-order phase
transition in the Reggeon Field Theory, discussed in Ch. 46. Other possibilities can form
a rich area for research.
shed risk. Selling pressure increases dramatically as more and more people give
up strategies and pursue the flight to quality. All instruments except the most 
risk-free (generally treasuries) drop in value, and diversification fails. 

Traders can be hit badly in stressed markets along with investors, losing 
buckets of money. To a first rough (but not misleading) approximation, traders 
cleverly buy relatively cheap but potentially risky products and sell treasuries for 
hedging. When the asteroid hits they lose twice—once because their risky assets 
(that cannot find buyers) drop in value, and again because their short positions in 
treasuries drop in value. Convergence plays can also get demolished before the 
theoretical convergence can take place. Before being disbanded, the sophisticated 
Salomon Arb Group was a victim of such phenomena. Other victims of the same 
1998 disaster included various hedge funds, notably LTCM.


Bond Math 


In this section, we give some basics on “bond math". The reader should be 
warned that the practical details are messy. 


Discount Factors 


Discount factors are essential to understand because they describe, once a cash
flow is determined in the future, how much the cash flow is worth today. The set
of zero-coupon bonds $\{P^{(T)}\}$ are the set of discount factors. The zero-coupon
bond $P^{(T)}$ equals today's value of a future cash flow of amount $1 to be paid by
issuer ABC at a time $T$ in the future. Conversely, $1 / P^{(T)}$ tells you how much $1
invested today will be worth at time $T$ if invested in an interest-bearing account
with rate typical of the ABC coupon. Theoretically, $\{P^{(T)}\}$ arise from stripping
coupon bonds of different maturities. Since in general there are only a discrete
(and sometimes small) number of bonds for a given issuer, aggregation is used to
get discount factors for a given credit rating in a given sector. Interpolation
schemes must be adopted in order to calculate the discount factors for arbitrary
maturities between the known ZC bond maturity dates.
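One simple interpolation scheme can be sketched as follows. The Python below assumes log-linear interpolation of the discount factors (equivalent to a piecewise-constant forward rate between known maturities). That is one common choice among many, not one prescribed here, and the function name and inputs are illustrative.

```python
import math
from bisect import bisect_right


def interpolate_discount_factor(t, times, dfs):
    """Log-linear interpolation of discount factors between known maturities.

    times : increasing maturities in years, starting at 0.0
    dfs   : discount factors at those maturities, with dfs[0] = 1.0
    """
    if t < times[0] or t > times[-1]:
        raise ValueError("maturity outside the range of known discount factors")
    # Index of the left-hand knot, capped so that times[i + 1] always exists
    i = min(bisect_right(times, t) - 1, len(times) - 2)
    w = (t - times[i]) / (times[i + 1] - times[i])
    return math.exp((1.0 - w) * math.log(dfs[i]) + w * math.log(dfs[i + 1]))


# Made-up zero-coupon prices at 0, 1, 2 and 5 years
times = [0.0, 1.0, 2.0, 5.0]
dfs = [1.0, 0.97, 0.94, 0.85]
df_18m = interpolate_discount_factor(1.5, times, dfs)
```

Other schemes (for example, interpolating zero rates, or splines) give slightly different values between the knots, which matters for instruments whose cash flows fall far from the liquid maturities.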


Yields 
Bond prices can be recast equivalently as "yields". The yield $y$ is a common rate
to be used in all discount factors for all coupons in order to produce the given
price of the bond. A quick mnemonic is "yield up, price down". What this
means is that if rates go up and the coupons of newly-issued bonds increase,
investors will pay less for an already-issued bond because it pays a lower
coupon.


Take a coupon bond paying $f$ coupons per year, with the $j$-th coupon to be
paid at date $T_j$, a time interval $\tau_j$ years from today. There are “day-count
conventions” for what constitutes a year. The yield has the attribute of the
frequency $f$ compounding, and the discount factor for that cash flow is
$P^{(T_j)} = [1 + y/f]^{-f \tau_j}$. Note as $f \to \infty$ we get continuous compounding, i.e.
$P^{(T_j)} \to \exp(-y \tau_j)$. The price $B^{(T)}_{ABC}$ of the ABC bond of maturity $T$, a time
interval $\tau$ from today, with annual coupon $c_{ABC}$, is the sum of all the discounted
cash flows:


$$B^{(T)}_{ABC}(y) = \sum_{j=0}^{N} \frac{c_{ABC}}{f} \, P^{(T_j)} + 100 \cdot P^{(T)} \tag{9.1}$$


The $j = 0$ accrued interest term is omitted for the “clean price”, and is
present for the “dirty price” with the discount factor omitted. The $j = 1$ term


12 Other Yield Conventions (nominal yield, yield to maturity, yield to worst): There
is also a “nominal yield”, which is the annual coupon, and a “current yield”, which is the
annual coupon divided by the price in decimal. If the bond is callable, the coupons up to
the first call date are employed to give the “yield to call” YTC. The “yield to maturity” 
YTM is the definition in the text. The “yield to worst” YTW is the minimum of YTM and 
YTC. For a premium bond called at par, YTW = YTC and for a discount bond YTW = 
YTM. For munis, by regulation the YTW must be the yield quoted to clients. 


13 30/360 Day Count: This assumes that there are 30 days per month and 360 days per
year, as described in the preceding chapter. 


14 More Bond Conventions: The par value of a bond is 100 as in Eq. (9.1). However the 
face value of a bond is typically $1,000 for USD bonds. Just another reason to label your 
spreadsheet clearly. 


15 Clean or Dirty Price? The clean price is more stable than the dirty price, and is the
quoted market price. 


16 Accrued Interest: Accrued interest is calculated from and including the last interest 
payment date, up to but not including the settlement date of the trade. Settlement (regular 
way) is t + 3 days for corporates and t + 1 day for US governments, where t is the trade 
date. Accrued interest is paid to the seller of the bond. 
has the first full coupon. The last ($N$-th) coupon is paid at maturity. In real
calculations, there are a myriad of details.
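The price-yield relation of Eq. (9.1) can be sketched in Python under simplifying assumptions: valuation exactly on a coupon date (so the accrued-interest term vanishes and clean and dirty prices coincide), evenly spaced coupons at $\tau_j = j/f$, and the annual coupon quoted per 100 par. The function name and example bond are illustrative, and real day-count conventions are ignored.

```python
def bond_price(y, annual_coupon, f, n_coupons, par=100.0):
    """Price per 100 par of a fixed-rate bond, valued on a coupon date.

    y             : yield in decimal, compounded f times per year
    annual_coupon : annual coupon per 100 par (5.0 means 5%)
    f             : number of coupons per year
    n_coupons     : number N of remaining coupons
    """
    price = 0.0
    for j in range(1, n_coupons + 1):
        tau_j = j / f                              # time to the j-th coupon, years
        discount = (1.0 + y / f) ** (-f * tau_j)   # discount factor for that date
        price += (annual_coupon / f) * discount
    price += par * (1.0 + y / f) ** (-n_coupons)   # principal repaid at maturity
    return price


# A 10-year bond with a 5% coupon paid semiannually
at_par = bond_price(0.05, 5.0, 2, 20)
cheaper = bond_price(0.06, 5.0, 2, 20)
```

When the yield equals the coupon rate the bond prices at par, and raising the yield lowers the price ("yield up, price down").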


Duration and Convexity 

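The duration $D$ and convexity $C$ of a bond with price $B$ are defined as:


$$D = -\frac{1}{B} \frac{\partial B}{\partial y}, \qquad C = \frac{1}{B} \frac{\partial^2 B}{\partial y^2} \tag{9.2}$$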


The duration, defined with the minus sign, is positive. Note that the duration, up
to a factor, is the weighted average time of payments. To second order, we get
the relation


$$\delta B = \left[ -D \, \delta y + \tfrac{1}{2} C \, (\delta y)^2 \right] B \tag{9.3}$$
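A small numerical check of the second-order relation is sketched below in Python. It assumes a plain fixed-rate bond valued on a coupon date and estimates duration and convexity by central finite differences. The helper names, the step size, and the example bond are illustrative choices.

```python
def bond_price(y, annual_coupon=5.0, f=2, n_coupons=20, par=100.0):
    """Price per 100 par of a fixed-rate bond valued on a coupon date."""
    q = 1.0 + y / f
    coupons = sum((annual_coupon / f) * q ** (-j) for j in range(1, n_coupons + 1))
    return coupons + par * q ** (-n_coupons)


def duration_convexity(price_fn, y, h=1.0e-4):
    """Central-difference estimates of D = -(1/B) dB/dy and C = (1/B) d2B/dy2."""
    b0 = price_fn(y)
    b_up = price_fn(y + h)
    b_dn = price_fn(y - h)
    duration = -(b_up - b_dn) / (2.0 * h * b0)
    convexity = (b_up - 2.0 * b0 + b_dn) / (h * h * b0)
    return duration, convexity


y0, dy = 0.05, 0.01
b0 = bond_price(y0)
duration, convexity = duration_convexity(bond_price, y0)
first_order = b0 * (1.0 - duration * dy)
second_order = b0 * (1.0 - duration * dy + 0.5 * convexity * dy * dy)
exact = bond_price(y0 + dy)
```

For a 100 bp move the duration-only estimate misses the repriced bond by a few tenths of a point, while adding the convexity term removes most of that gap.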


The DV01 is the change in the bond price $\delta B$ for a one bp/yr change in
yield, viz. $\delta y = 10^{-4}$/yr. This includes all changes in the bond price, including
any changes due to embedded options in the bond if they exist. The DV01 is
generally defined numerically, since call features in bonds and other complexities
cannot be described analytically. The conventions for the actual number of basis
points moved vary. If the move $\delta y$ is too small, the numerical algorithms can
become unstable and/or produce unphysical numerical jumps¹⁹. If the move is too
big, large second order convexity effects enter. For example we can move
$\delta y = 50$ bp/yr. Then the change $\delta B$ is scaled back down by the size of the
move (here, divided by 50) to get the DV01.
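The bump-and-rescale recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bond is a plain bullet bond with no embedded options, and the function names (`bond_price`, `dv01`, `duration_and_convexity`) and all numbers are made up for the example rather than taken from any production system.

```python
def bond_price(yield_per_year, coupon=5.0, years=10, freq=2, par=100.0):
    """Price per 100 par of a plain bullet bond with no embedded options."""
    n = years * freq
    y = yield_per_year / freq            # yield per coupon period
    c = coupon / freq                    # coupon paid each period
    return sum(c / (1.0 + y) ** k for k in range(1, n + 1)) + par / (1.0 + y) ** n


def dv01(price_fn, y, bump_bp):
    """Price change for a 1 bp/yr rise in yield, estimated from a bump of
    `bump_bp` basis points and then scaled back down by the bump size."""
    dy = bump_bp * 1.0e-4
    return (price_fn(y + dy) - price_fn(y)) / bump_bp


def duration_and_convexity(price_fn, y, dy=1.0e-4):
    """Central finite-difference estimates of D and C as defined in Eq. (9.2)."""
    b0, up, dn = price_fn(y), price_fn(y + dy), price_fn(y - dy)
    duration = -(up - dn) / (2.0 * dy) / b0
    convexity = (up - 2.0 * b0 + dn) / (dy * dy) / b0
    return duration, convexity


y0 = 0.05
D, C = duration_and_convexity(bond_price, y0)
dv01_1bp = dv01(bond_price, y0, 1.0)
dv01_50bp = dv01(bond_price, y0, 50.0)     # convexity leaks into this estimate

# Second-order relation of Eq. (9.3) against the actual change for a 50 bp move.
dy = 0.005
second_order = (-D * dy + 0.5 * C * dy * dy) * bond_price(y0)
actual = bond_price(y0 + dy) - bond_price(y0)
print(D, C, dv01_1bp, dv01_50bp, second_order, actual)
```

The 50 bp estimate is smaller in magnitude than the 1 bp estimate because the positive convexity term enters the one-sided difference. A real desk calculation would reprice with the full model, including any call features.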


17 Bond Conventions: These can get very messy and depend on the contract details of the
bonds. In particular, bonds have a variety of conventions determining the actual interest 
payments. To get a complete description of the complexities, consult The Bloomberg (hit 
the GOVT button and then type DES and HELP). You will get around 45 pages listing 
600 conventions for calculating interest for government bonds in different countries. This 
is cool. 


18 Other Durations: Macaulay duration is defined to include a factor (1+y/f).


19 Numerical Instabilities and DV01: Numerical code involves discretization and if the
yield change is too small, the code gets two nearly equal prices. The price error may be
relatively small, but the difference between the nearly equal prices is small enough to be
very sensitive to these price errors, sometimes unfortunately resulting in instabilities in
DV01. If this happens, increasing the shift $\delta y$ can help. All this is hard to explain to some
people.
Convexity and Callable Bonds


Convexity for callable bonds is tricky. There are two competing factors, the short 
call option in the bond (negative convexity) and the discount factors (positive 
convexity). For low bond price, the callable bond becomes non-callable (the 
issuer would never pay the required price near 100 for a bond worth less), so the 
convexity is determined by the discount factors, and is positive. For high bond 
price, the call option is not out of the money and the convexity of the short call 
option dominates, so the bond has negative convexity. 


Spreads 


Universally, the world of bonds is described by their spreads. Differences 
between a bond's yield and the yield at the same maturity for a given benchmark 
(e.g. government or Libor) define that bond's spread. Typically, spreads are 
quoted for a given credit and a given sector (e.g. BBB US Industrials). 

The spread contains all information about the price, given the benchmark. 
Therefore, all the effects determining the bond price enter in the spread. These 
include the perception of risk due to potential default and credit downgrades, 
technical supply/demand factors, market psychology, and generally all the 
information used by bond traders²⁰.
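As a concrete illustration of the definition, here is a minimal Python sketch that interpolates a benchmark yield at the bond's maturity and subtracts it from the bond yield. The curve points, the linear interpolation, and the function names are assumptions made for the example; desks use their own curve construction.

```python
def benchmark_yield(maturity, curve):
    """Linearly interpolate a benchmark yield (in %) at `maturity` (years).
    `curve` is a list of (maturity, yield) points sorted by maturity."""
    if maturity <= curve[0][0]:
        return curve[0][1]
    for (t0, y0), (t1, y1) in zip(curve, curve[1:]):
        if t0 <= maturity <= t1:
            w = (maturity - t0) / (t1 - t0)
            return y0 + w * (y1 - y0)
    return curve[-1][1]                  # flat extrapolation past the last point


def spread_bp(bond_yield, maturity, curve):
    """Bond yield minus the same-maturity benchmark yield, in basis points."""
    return (bond_yield - benchmark_yield(maturity, curve)) * 100.0


treasury_curve = [(2.0, 4.00), (5.0, 4.50), (10.0, 5.00)]   # made-up numbers
bond_spread = spread_bp(6.10, 7.0, treasury_curve)
print(bond_spread)
```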


Option-Adjusted Spreads (OAS) 


The option-adjusted spread (OAS) for bonds with embedded options is defined as 
a calculated spread added to the benchmark curve, such that the bond model price 
is the same as the market price. The benchmark curve is the curve off which 
spreads are defined, e.g. US Treasury or some other curve²¹. The model needs to
use whatever logic is needed to include the embedded options. The model 
numerical algorithm can be a formula, a numerical calculation using a discretized 
lattice, a Monte-Carlo simulation, etc. Given the benchmark curve, the OAS is 
therefore a translation of the bond price, including the effect of the options. 
Callable bonds, mortgage products, etc. are commonly quoted using OAS. 
Duration and convexity are often defined such that the OAS is held constant. 
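The OAS definition is a root-finding problem: vary the spread until the model price equals the market price. The Python sketch below shows only that outer loop. The "model" here is a toy that discounts fixed cash flows and ignores the embedded options entirely, which is an assumption made to keep the example short; a real OAS model would value the options on a lattice or by Monte Carlo inside `model_price`. All names and numbers are hypothetical.

```python
import math


def model_price(spread, cash_flows, zero_rate):
    """Toy model price: discount each (time, amount) cash flow at the
    benchmark zero rate plus `spread`, with continuous compounding.
    The embedded-option logic of a real OAS model is omitted here."""
    return sum(cf * math.exp(-(zero_rate(t) + spread) * t) for t, cf in cash_flows)


def solve_oas(market_price, price_fn, lo=-0.05, hi=0.50, tol=1.0e-10):
    """Bisection for the spread s with price_fn(s) equal to market_price,
    assuming price_fn decreases as s increases."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if price_fn(mid) > market_price:
            lo = mid                     # model price too high: widen the spread
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def benchmark_zero(t):
    """Made-up upward-sloping benchmark zero curve."""
    return 0.04 + 0.002 * t


flows = [(t, 6.0) for t in range(1, 5)] + [(5, 106.0)]      # 5-yr, 6% annual
oas = solve_oas(98.0, lambda s: model_price(s, flows, benchmark_zero))
print(round(oas * 1.0e4, 1), "bp")
```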


20 Spreads and Implied Probabilities of Default: Some analysts assume that bond
spreads are entirely due to the possibility of default. Actually, the logic is turned around
to get “implied probabilities of default” $p_{imp\_default}$ from the spreads. This is done using
bond spreads in the discount factors along with logic that eliminates bonds that happen to
default in the future based on the probability $p_{imp\_default}$. Then $p_{imp\_default}$ is varied until the
bond price is obtained. However, historical statistics give actual default probabilities that
are much smaller than the theoretical probabilities $p_{imp\_default}$.


See Ch. 8 for a derivation of implied default probabilities from credit default swaps. 


21 Reference Curve for OAS: This is a choice that defines the OAS. The reference curve
might or might not include spread risk. 
General, Specific, and Idiosyncratic Risks


Risk can be classified in successive degrees of refinement. If an average risk is 
taken over many bonds, such as a large bond index²², the risk is called “general”.
As refinements are made, narrowing the focus to an average over bonds in a 
sector, the risk becomes more “specific”. If the details of a specific issuer are 
specified, the risk is highly specific. The difference between specific and general 
risks is also a form of "idiosyncratic risk". The problem in getting idiosyncratic 
risk essentially is that the market prices are not known for all bonds. Hence, 
various approximations have to be made. 


Bond Matrix Pricing 

There are various approximate pricing methods. One is called "matrix pricing" 
where known prices of some bonds in a given sector are marked on a matrix of 
coupon vs. maturity, and interpolation is used to get other prices. 
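A minimal Python sketch of the idea, assuming a rectangular coupon-by-maturity grid and bilinear interpolation (the text does not specify the interpolation scheme, so that choice is an assumption, as are the prices):

```python
def matrix_price(coupon, maturity, coupons, maturities, prices):
    """Bilinear interpolation in a coupon-by-maturity grid.
    prices[i][j] is the known price for coupons[i] and maturities[j]."""
    def bracket(x, grid):
        for k in range(len(grid) - 1):
            if grid[k] <= x <= grid[k + 1]:
                return k, (x - grid[k]) / (grid[k + 1] - grid[k])
        raise ValueError("point lies outside the matrix")

    i, u = bracket(coupon, coupons)
    j, v = bracket(maturity, maturities)
    return ((1 - u) * (1 - v) * prices[i][j] + u * (1 - v) * prices[i + 1][j]
            + (1 - u) * v * prices[i][j + 1] + u * v * prices[i + 1][j + 1])


coupons = [4.0, 6.0]                 # % per year
maturities = [5.0, 10.0]             # years
prices = [[95.0, 92.0],              # made-up prices for the 4% coupon row
          [103.0, 106.0]]            # made-up prices for the 6% coupon row
estimate = matrix_price(5.0, 7.5, coupons, maturities, prices)
print(estimate)
```

At a grid point the function returns the known price, so marked bonds are reproduced exactly and only the unmarked bonds are estimated.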


Bond Factor Models 


"Factor models" assign components of bond spreads to various issuer credit and 
sector characteristics etc., and thereby arrive at an approximate theoretical price 
for a bond. This is done by “cross-sectional” regression to find the factor model 
parameters using bond price data for a given time, and then applying the factor 
model to all bonds of a given type. 
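The cross-sectional regression can be sketched in plain Python. The two factors used here (a numerical rating score and a financial-sector flag), the additive form of the model, and the data are all assumptions for illustration; real factor models use many more characteristics.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_factor_model(rows, spreads):
    """Least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * s for r, s in zip(rows, spreads)) for i in range(k)]
    return solve_linear(xtx, xty)


# Each row: [1 (intercept), rating score, 1 if financial sector else 0].
rows = [[1, 1, 0], [1, 2, 0], [1, 3, 1], [1, 4, 1], [1, 2, 1], [1, 3, 0]]
spreads = [70, 110, 175, 215, 135, 150]       # bp, made-up data for one date
beta = fit_factor_model(rows, spreads)

new_bond = [1, 4, 0]                          # a bond with no observed price
predicted_spread = sum(b * x for b, x in zip(beta, new_bond))
print(beta, predicted_spread)
```

The fitted spread for the unpriced bond would then be converted to an approximate theoretical price using the benchmark curve.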
10. Interest-Rate Caps (Tech. Index 4/10)


Introduction to Caps 


In this chapter, we discuss interest-rate caps. Caps provide insurance against 
rising interest rates by paying off if rates go up enough. The picture below gives 
the idea for one piece of a cap, called a caplet. 


[Figure: Caplet Kinematics, one forward rate. Diffusion of the forward rate $f_l(t)$
starts with value $f_l(t_0)$ at time $t_0$ and ends at maturity $t_l$. The strike of the
corresponding caplet is E. Paths in/at/out of the money are notated as ITM/ATM/OTM,
respectively.]
We first discuss standard Libor caps, and then go on to discuss Prime caps
and CMT caps. Other rates (e.g. CP, Muni) are also used. We present the
standard market model that uses lognormal forward rates. The new ingredient for 
options is, of course, volatility. 


The picture above illustrates a particular Libor forward rate $f_l(t)$ diffusing
from the present time $t_0$ to its maturity $t_l$ with some volatility $\sigma_l$. The set of
volatilities $\{\sigma_l\}$ is called a “term structure of volatilities” for the forward rates.

While many possible interest-rate processes have been suggested, volatilities
as traded in the market are quoted in lognormal-rate language. This means that
$\sigma_l^2\,dt = \left\langle \left[ \left( f_l(t+dt) - f_l(t) \right) / f_l(t) \right]^2 \right\rangle$, averaged over stochastic variability,
in time $dt$. In addition, $\sigma_l$ could depend on $t$. Volatilities for other processes
have different definitions and different units. Sometimes model volatilities are
used for hidden variables before a transformation is made into physical rates¹.


At the Money Options and the Trader on a Stick 
In the above figure, we show the strike E of a hypothetical call option on the
forward rate $f_l$ along with three paths, ending in the money (ITM), out of the
money (OTM), and at the money (ATM).
We might irreverently call the ATM path the “frozen trader-on-a-stick path”².
As we have seen, a swap is a collection of swaplets. In the same way, a cap is
a collection or a basket of caplets. The $l$-th caplet is priced using its own volatility
$\sigma_l$. This volatility is placed in the lognormal process of the forward rate
corresponding to the caplet option expiration date $t_l$. From the lognormal


1 Different Models, Trading, Reporting: Sometimes one model will be used for trading
and another (usually simpler) model for reporting. This leads to inconsistencies in the
way the desk views its risk (using the trading model) and the way the risk is reported to
corporate risk management (using the reporting model). Sociologically, the desk does not
care much about the reporting model, while risk management is often concerned about
consistency. Discussions backed up by authority may be needed for resolution.


2 Frozen-Trader-on-a-Stick Near Expiration ATM: This path winds up very near
the strike at expiration. Near expiration, the hypothetical trader sits frozen, since he does
not know whether to lift the hedges as needed for OTM where Δ = 0 or keep the hedges
as needed for ITM where Δ = 1. The quant would say that near expiration and at the
money, gamma (the change of delta for small changes in rates) becomes very large. The
trader would not be amused.
assumption for $f_l(t)$, it follows that the Black model with the caplet volatility is
applicable³. The cap value is the sum of caplet values, $\$C_{cap} = \sum_l \$C_l$.


The Black Caplet Formula 


The classic Black formula for the $l$-th caplet value evaluated at time $t_0$ is

$$ \$C_l = \$\mathcal{N}_l \left[ \hat{f}^{(l)} N(d_{l+}) - E\,N(d_{l-}) \right] dt_{reset}\; P^{(l)}_{Arrears} \tag{10.1} $$

Here $\hat{f}^{(l)} = f_l(t_0)$, where we use carets ^ indicating that the forward rates are
determined today. The time interval to the caplet maturity is $\tau_l$. $\$\mathcal{N}_l$ is the $l$-th
notional, in cursive to avoid confusion with the normal functions $N(d_{l\pm})$. Also
$P^{(l)}_{Arrears}$ is the zero coupon bond price for discounting in arrears (payment made
after reset), $dt_{reset}$ is the time interval between resets (1/4 year for 3 month Libor),
E is the rate strike (which can also depend on the caplet), and

$$ d_{l\pm} = \frac{\ln\left( \hat{f}^{(l)} / E \right) \pm \tfrac{1}{2}\sigma_l^2 \tau_l}{\sigma_l \sqrt{\tau_l}} \tag{10.2} $$
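For readers who like to see numbers, here is a minimal Python sketch of the Black caplet value of Eqs. (10.1) and (10.2). The function and argument names are my own for this example, and the inputs are made up.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_caplet(fwd, strike, vol, tau, dt_reset, discount, notional=1.0):
    """Black value of one caplet (a call on the forward rate).
    fwd: forward rate seen today; tau: time to the caplet maturity;
    dt_reset: time between resets; discount: zero-coupon bond price
    to the payment date (in arrears)."""
    s = vol * math.sqrt(tau)
    d_plus = (math.log(fwd / strike) + 0.5 * s * s) / s
    d_minus = d_plus - s
    undiscounted = fwd * norm_cdf(d_plus) - strike * norm_cdf(d_minus)
    return notional * undiscounted * dt_reset * discount


# Made-up inputs: 5% forward, 5% strike, 20% vol, reset in 1 year,
# 3-month accrual, discount factor 0.94, $10MM notional.
caplet = black_caplet(fwd=0.05, strike=0.05, vol=0.20, tau=1.0,
                      dt_reset=0.25, discount=0.94, notional=10_000_000)
print(caplet)
```

A cap is then priced by calling this once per caplet, each with its own forward rate, volatility, and discount factor, and summing.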


Floors and Floorlets 


Interest-rate floors are collections of floorlets. Floorlets are put options on rates 
and have corresponding formulae within the context of the Black model. 


3 Black Formula for Caplets: This is the Street-standard model. To motivate it, recall
that forward rates and futures are linearly related, up to a convexity correction. Ignoring
margin accounts, there is no cost for futures. Futures are not assets since they cost
nothing to buy. Therefore, the no-arbitrage drift of changes in forward rates is close to
zero, producing the Black formula. Equivalently, we can use the Black-Scholes formula
with a “dividend” yield exactly canceling a “risk-free rate”.


4 Homework: We spare the reader the details of the mathematics at this stage of the
book. However, the reader may want to derive the caplet and floorlet Black formulae. 
Cap and Caplet Implied Volatilities


A cap implied volatility (or vol for short), $\sigma_{cap}^{(impl)}$, is a single volatility to be used
in each Black formula for every caplet in that particular cap. The cap implied vol
$\sigma_{cap}^{(impl)}$ just serves as a proxy for that cap price. Caps of different maturities will
have different cap implied vols.

Brokers quote implied cap vols using the Black model. The quotes are
generally a composite of the deals done in the market that day. Not all maturities
in general will have traded. This is especially true for longer maturities, and for
caps on rates other than Libor, which are less liquid. The cap vols will also
generally be quoted for at or near the money options, since these are the most
liquid. Model pricing and extrapolation techniques are used for options away
from the money.

Cap and caplet implied vols are not the same. Caplet implied vols are the
market-determined values $\{\sigma_l^{(impl)}\}$ of the volatilities $\{\sigma_l\}$ for the individual
forward rates. The same caplet implied vol is used for each cap containing that
caplet.

A term structure of volatility for the caplets, with different volatilities
$\{\sigma_l^{(impl)}\}$, is obtained by varying the caplet volatilities until the
market cap prices are reproduced. To do this, a recursive procedure can be
used, starting with the shortest maturity cap to get the volatility of the shortest
maturity forward rates, and then progressing outward in maturity to
longer maturities. Here is a graph of an illustrative caplet vol term structure for a
5-year cap on 3M Libor, taken from my CIFEr tutorial:


[Figure: Caplet Volatility Term Structure. Caplet volatility (axis from 0% to 30%)
plotted against caplet number.]
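The recursive stripping procedure can be sketched in Python as follows. To keep the example short I assume that each successive cap adds exactly one caplet and that all caps share one strike; in practice quoted caps are spaced further apart and an interpolation assumption for the intermediate caplet vols is needed. The names and data are made up.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def caplet_value(fwd, strike, vol, tau, dt_reset, discount):
    """Black caplet value per unit notional."""
    s = vol * math.sqrt(tau)
    d_plus = (math.log(fwd / strike) + 0.5 * s * s) / s
    return (fwd * norm_cdf(d_plus) - strike * norm_cdf(d_plus - s)) * dt_reset * discount


def implied_vol(target, fwd, strike, tau, dt_reset, discount):
    """Bisection on vol, using the fact that caplet value increases with vol."""
    lo, hi = 1.0e-6, 5.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if caplet_value(fwd, strike, mid, tau, dt_reset, discount) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def strip_caplet_vols(cap_prices, caplets, strike):
    """cap_prices[k] is the price of the cap made of caplets 0..k.
    The newest caplet's value is the difference of successive cap prices,
    which is then inverted for that caplet's vol."""
    vols, previous = [], 0.0
    for price, (fwd, tau, dt_reset, discount) in zip(cap_prices, caplets):
        vols.append(implied_vol(price - previous, fwd, strike, tau, dt_reset, discount))
        previous = price
    return vols


# Round trip on made-up data: (forward, time to reset, accrual, discount).
caplets = [(0.050, 0.25, 0.25, 0.975), (0.052, 0.50, 0.25, 0.962),
           (0.054, 0.75, 0.25, 0.949)]
true_vols = [0.18, 0.22, 0.20]
strike = 0.05
cap_prices, running = [], 0.0
for vol, (f, tau, dt, p) in zip(true_vols, caplets):
    running += caplet_value(f, strike, vol, tau, dt, p)
    cap_prices.append(running)

stripped = strip_caplet_vols(cap_prices, caplets, strike)
print(stripped)
```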
Approximate Analytic Formula for Cap Implied Vol


A useful analytic approximation for the implied volatility of a cap involves
averaging, with the weights being the caplet values⁵. It reads

$$ \sigma_{cap}^{(impl)} \approx \frac{\sum_l \sigma_l^{(impl)}\, \$C_l}{\sum_l \$C_l} \tag{10.3} $$

Cap (Caplet) Vol Skew 


Just as for equity and FX options, the volatilities used to price a cap with the 
Black formula need to depend on the strike in order to get agreement with the 
market price. This defines a “cap vol surface". Caps of different maturity can be 
used to back out equivalent skews for caplet vols, producing a “caplet vol 
surface". 


Non-USD Caps 


In different currencies, the implied volatilities for a given maturity will be 
different. Insurance costs against rising interest rates in, say, Germany are not 
the same as insurance costs against rising US rates. This follows since the 
macroeconomic situations determining the long-term behaviors as well as the 
technical factors determining the short-term behaviors are in general different for 
Germany and the US⁶.


5 Implied Cap Vol Formula: This approximation works reasonably well even with
different notionals for the caplets. The physical motivation is that the cap vol is controlled 
by the dominant caplet vols, but only if the caplet values (that depend on the rates and 
strike) are significant. Hence, both caplet vols and caplet values enter. The formula is 
only applicable for one deal at a time, and in particular does not hold for portfolios of 
caps including long and short positions. 


6 Non-USD Rate Caps: Conventions differ. For example in the US, new caps generally
start with the first reset in the future (e.g. in 3 months to be paid at 6 months). However, 
in some currencies the convention is that new caps start with the first reset determined 
now (with the rate known), and with this known cash flow paid (e.g.) at 3 months. 
Relations between Caps, Floors, and Swaps


Put-Call Parity for Caps 

A put-call parity relation says that the price of a cap minus the price of a floor
with the same “kinematics” (resets, notionals, etc.) is the price of a plain-vanilla
pay-fixed swap⁷. This is the analog of put-call parity for equity or FX options. In
this case it reads

$$ \$C_{Cap} - \$C_{Floor} = \$S_{Pay\text{-}fixed\ Swap} \tag{10.4} $$

Now a Libor swap has no volatility dependence. Hence, the implied 
volatilities of caps and floors should be identical, provided the deal kinematics 
are the same. Illiquidity and technical supply-demand factors enter into the real 
world, breaking the theoretical dictum. So in practice, floor implied vols are not 
exactly the same as cap implied vols. 
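The parity relation is easy to check numerically within the Black model. The Python sketch below prices a two-piece cap and floor with the same kinematics and the same vol, and compares the difference with the corresponding pay-fixed swap. The single-function design with a `sign` argument, and all numbers, are choices made for this example.

```python
import math


def black_optionlet(fwd, strike, vol, tau, dt_reset, discount, sign):
    """Black value per unit notional; sign=+1 for a caplet, -1 for a floorlet."""
    def n(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    s = vol * math.sqrt(tau)
    d_plus = (math.log(fwd / strike) + 0.5 * s * s) / s
    d_minus = d_plus - s
    payoff = sign * (fwd * n(sign * d_plus) - strike * n(sign * d_minus))
    return payoff * dt_reset * discount


def pay_fixed_swaplet(fwd, fixed_rate, dt_reset, discount):
    """Receive floating, pay fixed: no volatility enters."""
    return (fwd - fixed_rate) * dt_reset * discount


# Made-up kinematics shared by the cap, floor, and swap:
# (forward rate, time to reset, accrual period, discount factor).
pieces = [(0.050, 0.25, 0.25, 0.975), (0.055, 0.50, 0.25, 0.962)]
strike, vol = 0.052, 0.20
cap = sum(black_optionlet(f, strike, vol, t, dt, p, +1) for f, t, dt, p in pieces)
floor = sum(black_optionlet(f, strike, vol, t, dt, p, -1) for f, t, dt, p in pieces)
swap = sum(pay_fixed_swaplet(f, strike, dt, p) for f, t, dt, p in pieces)
print(cap - floor, swap)
```

Changing `vol` changes the cap and the floor but leaves their difference fixed, which is the statement that the swap has no volatility dependence.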


Limiting Relations 

As interest rates rise significantly, a cap becomes deep ITM, while the floor with
the same kinematics becomes deep OTM and worthless. Therefore, a cap
approaches a pay-fixed swap as interest rates rise⁸.


Hedging Delta and Gamma for Libor Caps 


We have already looked at Libor interest rate swaps. The interest rate risk
measured by Δ and γ for interest rate caps is treated in the same way as for
swaps. The same comments regarding the Δ ladder and the gamma matrix, along
with the hedging instruments, hold here.

The delta hedging of caps is done with a combination of futures, swaps and
bonds (treasuries). Futures can in general only be transacted cheaply in batches of


7 More Homework: You'll remember the put-call relation better if you derive it yourself 
than if you just read about it. Hint: N(x) + N(-x) = 1. 


8 A Story: Swaps, Caps, and Objects in a C++ System: Once upon a time there was an
industrial-strength C++ object-oriented vendor derivatives system, whose object model
specified two objects for each derivative. One object was attached to each leg of a
fixed/floating swap. There seemed to be only one object for a cap, so the cap was
assigned one object, and the cap's dollar price was used as the other object. When it was
pointed out that a deep ITM cap, assigned a single object, turns into a swap, which
required two objects, it was clear that the object specification was inconsistent. The
system developers were startled. Naturally, it was much too late to do anything about it.
What lessons can we draw from this story? The answer is not at the back of the book.
100 and are mostly liquid only for short maturities, especially in non-USD
currencies. Swap transactions can be costly, and bonds are subject to repo rate 
risk. Long-dated transactions are the most problematic. For these reasons, exact 
delta hedging is impossible in practice. 

As explained in the previous chapter for swaps, ladder or bucket risk reports 
are generated giving the hedging mismatches as a function of maturity. These 
reports can take several forms, equivalent in content, but emphasizing different 
points of view. Examples include ladders in forward rates, ladders in swap rates, 
ladders in zero-coupon bonds etc. Some traders and managers get used to looking 
at certain reports and prefer those. Therefore, it is useful to become familiar with 
all the reports. 


Hedging Volatility and Vega Ladders 


We naturally have vega ladders to describe the details of the volatility 
dependence of interest-rate products. Each caplet has its own vega, defined as the 
sensitivity of the caplet value to its own volatility. These vegas correspond to the 
maturities of the caplets. An illustrative vega ladder for a 5-year forward cap: 


[Figure: Vega ladder. Vega (“futures equivalents”) plotted against caplet.]


The vega normalization here is in “futures equivalents”, defined as one future
equivalent = $2500, for a change in vol of 1%. If a caplet volatility $\sigma_l$ changes
by $\delta\sigma_l$ in %, then the caplet value $\$C_l$ will change by

$$ \$dC_l = \left( \frac{\$2500}{\%} \right) \cdot \mathrm{Vega}_l \cdot \delta\sigma_l(\%) \tag{10.5} $$

with the vega expressed in futures equivalents. For example, if the total vega for
this cap is 53.6, and if the volatility goes up by 1% for all caplets, the change in
the value of the cap is $134,000.
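The arithmetic of the example is spelled out in this short Python sketch. The individual caplet vegas in the ladder are invented; only their total of 53.6 and the $2500 normalization come from the text.

```python
DOLLARS_PER_FUTURE_EQUIVALENT = 2500.0    # per 1% change in vol


def cap_value_change(vega_ladder, vol_changes_pct):
    """Dollar change in cap value for the given per-caplet vol changes (in %),
    with the caplet vegas expressed in futures equivalents."""
    return DOLLARS_PER_FUTURE_EQUIVALENT * sum(
        vega * dvol for vega, dvol in zip(vega_ladder, vol_changes_pct))


ladder = [1.2, 2.4, 3.0, 47.0]            # made-up caplet vegas totalling 53.6
change = cap_value_change(ladder, [1.0] * len(ladder))
print(change)
```

Passing different vol changes per caplet gives the effect of a non-parallel move in the vol term structure.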
By linear interpolation, these caplet vegas corresponding to the caplet
expiration dates $\{t_l\}$ can be mapped onto the IMM dates $\{t_m^{IMM}\}$. The use of
the IMM dates has an advantage at the short end, because pit options can be used
for hedging purposes. Pit options in the US are options on Eurodollar futures, and
have IMM maturities. They are described like caplets and floorlets⁹. There are
no natural hedges farther out in maturity.

Sometimes swaptions (options on entire swaps) are used to hedge caps
(baskets of options on single forward rates)¹⁰. We look at swaptions in Ch. 11.


Implied Vols, Realized Vols, and Hedging Strategies 


The implied caplet volatilities $\sigma_l^{(impl)}$ and the realized rate volatilities $\sigma_l^{(realized)}$
(i.e. the actual volatility of the $f_l(t)$ rate as time progresses) are related. This
relation however is not perfect and it exhibits instabilities. It is sometimes said
that an implied volatility gives the “market expectation into the future” of the
to-be-realized volatility of the forward rate.

However, implied option volatilities are driven by supply and demand. 
Moreover, implied vols possess some strike dependence (skew). Such 
complexities for implied vols do not have much to do with realized volatilities. 

The hedging and desk strategy for interest-rate options requires considerable 
empirical skill. In the canonical example, a trader tries to buy volatility cheap 
relative to his projections of implied or realized vol levels. In addition, because 


9 Pit Options: A put on an ED future is comparable to a caplet, which is a call on a
forward rate. A pit option is also described by the Black formula with some normalization
differences. In particular, a pit option does not have the $dt_{reset}$ factor. This is because a
future has no units, whereas a rate has units = 1/time. In addition, pit options are
American options, which gives an extra (small) premium for the possibility of early
exercise. The European option approximation is reasonable for risk purposes. For details,
see Hull’s book (Ref).


10 The Dangers of Cross-Volatility Hedging: Trying to hedge caps with swaptions can
backfire when forward rates become decoupled from swap rates. For example, as we saw 
in the last chapter, the forward rates can change in such a way that the swap rates do not 
change. There are well-known examples of blowups in hedging procedures that run 
across volatility types. For this reason, even if volatility risk is aggregated for corporate 
risk reporting, it is best to keep track of each type of volatility risk on the desk separately. 
models are simplified, model hedges may be followed only approximately by
experienced traders. 


Bid/Ask Vols and Illiquidity 


Bid vols $\sigma_{bid}$ and ask vols $\sigma_{ask}$ correspond to buying and selling price levels.
The mid vol $\sigma_{mid}$ is the (bid, ask) average. In order for these vols to be well
defined, the caps actually have to trade in the market. The bid-ask vol spread
$\sigma_{spread} = \sigma_{ask} - \sigma_{bid}$ will be well determined for liquid caps that trade often and
not well determined for illiquid caps¹¹.

For reporting purposes, different philosophies can be adopted. Vol can be
marked at the mid level and a reserve taken for the amount projected to unwind
the position. The reserve can be built into the reporting directly by, for example,
marking the long vol positions to the bid vol (thus undervaluing with respect to
the mid vol).


Matrices of Cap Prices 


For convenience, matrices of cap prices can be constructed with different strike 
levels and maturities. Separate matrices will exist for bid, ask and mid vols. The 
model is used for the interpolation, since only a small fraction of the caps in the 
matrix will be quoted in the market on a given day. Naturally, similar remarks 
hold for floors. 


Prime Caps and a Vega Trap 


The Prime rate is the interest rate that US banks charge to their most creditworthy 
customers. The Prime rate generally changes across the banking industry, 
depending on macroeconomic conditions, when a major bank decides to change 
it. Prime caps provide insurance against increases in the Prime rate. Now because 
the Prime rate is only changed sporadically, it has a behavior that does not look 
like diffusion at all. Rather the Prime rate has a step-like behavior where it is 


11 Bid/Ask Vol Illiquidity Problems: For illiquid vol products, it can happen that only
one side of the market exists. For example, dealers may only be selling a given product to 
clients but not trading the product with other dealers. In this case the ask vol is known but 
the bid vol is not known (vols are from the point of view of the dealer). Hence, the mid 
vol is not known either. In that case, some assumptions have to be used. If suddenly the 
dealer has to buy back this vol because of strategy changes or whatever, the real bid-ask 
spread may be nowhere near the assumption. 
fixed for a relatively long time on each step¹². Each step can last for macroscopic
times (e.g. months) during which the Prime rate is unchanged.
Nonetheless, Prime caps are priced assuming a Prime rate diffusion process
with a Prime volatility. Model assumptions are used to get the Prime volatility
$\sigma_{Prime}$. Typically $\sigma_{Prime}$ is related to the Libor vol $\sigma_{Libor}$ for equivalent maturity.


There are two common methods for pricing Prime caps. Both use the Black
model with lognormal volatility, but with different parameters. These are¹³:

1. Use $\sigma_{Prime} = \sigma_{Libor} \cdot R_{Libor} / R_{Prime}$, where $R_{Prime} = R_{Libor} + s_{Prime\_Libor}$ is the Prime
rate given by Libor plus the Prime-Libor spread. This assumes that the Gaussian
vols $\sigma_G^{Prime}$ and $\sigma_G^{Libor}$ given by changes in Prime and Libor rates, using the
formula $\sigma_G^2\,dt = \left\langle \left[ R(t+dt) - R(t) \right]^2 \right\rangle$, are equal. Since a Gaussian vol is the
lognormal vol multiplied by the rate, this gives $\sigma_{Prime} R_{Prime} = \sigma_{Libor} R_{Libor}$.
The strike $E_{Prime}$ is as specified in the Prime cap.

2. Use the Libor vol $\sigma_{Libor}$ and Libor rate $R_{Libor}$ along with an equivalent
Libor strike obtained by subtracting the Prime-Libor spread from the Prime
strike, $E_{Libor}^{Equiv} = E_{Prime} - s_{Prime\_Libor}$. Lowering the strike has a similar effect to
raising the rate. However, the vol prescription is different than in method 1.

There is a potential trap related to Prime vega that depends on the method 
used. Commonly what is done is to quote vega by “changing input vols by 1%” 
and finding the change in the option value. In both methods, the input vol is 
Libor. However, in method 1, the change in Prime vol will be less than 1%. If, 
just by a change in the semantics, we redefined “input vol” to mean the vol used 
as input to the Black model, then we would change the Prime vol by 1%. 
Therefore, it is clear that an innocuous-sounding change in the definition leads to 
a change in the risk quoted for the option¹⁴.
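The two parameter prescriptions, and the vega trap, can be seen in a few lines of Python. The numbers are made up and the function names are mine; the resulting vol, rate, and strike would then be fed to the Black formula.

```python
def prime_vol_method_1(libor_vol, libor_rate, spread):
    """Method 1: equal Gaussian vols, so the lognormal Prime vol is the Libor
    vol scaled by R_Libor / R_Prime, with R_Prime = R_Libor + spread."""
    return libor_vol * libor_rate / (libor_rate + spread)


def equivalent_libor_strike(prime_strike, spread):
    """Method 2: keep the Libor vol and rate, and lower the strike instead."""
    return prime_strike - spread


libor_rate, spread = 0.05, 0.03          # made-up: Prime = Libor + 300 bp
libor_vol, bump = 0.20, 0.01             # "change input vols by 1%"

base_prime_vol = prime_vol_method_1(libor_vol, libor_rate, spread)
bumped_prime_vol = prime_vol_method_1(libor_vol + bump, libor_rate, spread)
prime_vol_move = bumped_prime_vol - base_prime_vol   # smaller than the bump
libor_strike = equivalent_libor_strike(0.09, spread)
print(base_prime_vol, prime_vol_move, libor_strike)
```

With these inputs a 1% bump in the Libor vol moves the method 1 Prime vol by only 0.625%, so a vega quoted per "1% input vol" is smaller than a vega quoted per 1% of the vol actually entering the Black formula.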


12 The Prime Rate and the Macro-Micro Model: The constancy of the Prime rate for
macroscopic times was the prototype for the Macro-Micro model for a macro- 
economically determined rate in the absence of trading. That is, even though derivative 
instruments on the Prime rate exist (swaps, caps) and their market prices fluctuate, the 
Prime rate itself does not fluctuate. The Macro-Micro model is described in Ch. 47-51. 


13 Models, Rigor, and Clients: These seemingly crude approximations for Prime caps
may offend the rigorously minded quant. Be aware that on the desk, empirical models are
sometimes employed regardless of “theory”. There may be no alternative. What would be
your theory of Prime rate dynamics? By the way, you need an indicative Prime cap price
right after lunch, because that is when the salesman is going to call up the client.


14 Definitional Traps for Risk: Prime vega ambiguity is not the only example of why it
is important to know the details of how the risk is defined. Once you find out, it is a good 
idea to document things and then periodically monitor the situation in case something 
changes. 
CMT Rates; Volatility Dependence of CMT Products


Caps and swaps are also written on CMT (Constant-Maturity Treasury) rates and
on CMS (Constant-Maturity Swap) rates¹⁵. A CMT rate has a definite maturity
τ (e.g. τ = 5 for 5-yr CMT). We need the maturity-τ CMT forward rate at time
t. There is an extra complexity, namely a state dependence is present for the
various possible values of a forward CMT rate at a given time¹⁶. The diagram
gives the setup for forward CMT rates:


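[Figure: Forward CMT Rate Kinematics, showing the forward CMT rate
$R^{(\tau)}_{CMT}(t; \mathrm{Node}_\alpha)$ and the short rate $r(t; \mathrm{Node}_\alpha)$.]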


15 CMT Rates and CMT Derivatives: The CMT rates are obtained by fitting the
treasury curve, and are contained in a weekly Federal Reserve Bank “H15” report. 
Derivatives written on CMT rates are used by, for example, insurance companies that 
have products such as SPDAs that have payouts based on CMT, for example the 5-yr 
CMT. If this rate goes up, the company loses money. A CMT cap can protect against this 
risk. CMT products are also used as hedging vehicles for mortgage-related activities, 
since mortgage rates are correlated with treasury rates. CMT caps are illiquid and broker 
quotes can remain unchanged for long periods of time. 


16 Simplified CMT Models: Sometimes the complexity of the node dependence of CMT 
rates is ignored. See Hogan et al (ref). 
A major consequence or complication is that the CMT rates are volatility
dependent. The forward CMT rate is a composite rate that depends on the
diffusion probability of getting to a given value at time t. For example, if a
short-rate model is used for the underlying dynamics¹⁷, this diffusion probability
depends on the short-rate volatility.

The forward CMT rate (of maturity τ, at time t, at node α) is called
$R^{(\tau)}_{CMT}(t; \mathrm{Node}_\alpha)$. The nodes arise from a discretization of rates at each time for
implementing numerical algorithms.
The CMT rate $R^{(\tau)}_{CMT}(t; \mathrm{Node}_\alpha)$ and the short rate $r(t; \mathrm{Node}_\alpha)$ are shown.
The discount factor $P^{(\tau)}(t; \mathrm{Node}_\alpha)$ from maturity date $t + \tau$ back to time t at
node α is also shown, along with the discount factor $P^{(j)}(t; \mathrm{Node}_\alpha)$ corresponding
to an intermediate date $t_j$. CMT rates can be built up from short rates using the
same short-rate model used to price other derivatives, for consistency.

This forward CMT rate is equal to the coupon that is obtained from setting
the corresponding forward treasury coupon bond to par at the forward node α,
namely $B^{(\tau)}_{Tsy}(t; \mathrm{Node}_\alpha) = 100$. The bond $B^{(\tau)}_{Tsy}(t; \mathrm{Node}_\alpha)$ depends on the state at
time t because the forward discount factors used to construct this bond depend
on the state. There are other treasury-market related corrections¹⁸.


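Explicitly, the discount factors¹⁹ $P^{(j)}(t; \mathrm{Node}_\alpha)$ back from times $t_j$ to time
t at specific node α are used to construct the bond²⁰ with coupon $c^{(\tau)}$:

$$ B^{(\tau)}_{Tsy}(t; \mathrm{Node}_\alpha) = c^{(\tau)} \sum_{t_j > t}^{t+\tau} P^{(j)}(t; \mathrm{Node}_\alpha) + 100 \cdot P^{(\tau)}(t; \mathrm{Node}_\alpha) \tag{10.6} $$

At par (i.e. 100) for the bond, $c^{(\tau)}$ by definition is $R^{(\tau)}_{CMT}(t; \mathrm{Node}_\alpha)$, i.e.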


17 Short-rate Formalism: The interested reader might consult Ch. 43 for some details. 


18 Repo and Auction Complexities: Because CMT rates are connected with the treasury
market, some complexities of repo (including forward repo curves) and auction effects 
enter CMT calculations. 


19 Discount Factors: For CMT we use the treasury discount factors; for CMS we
would use Libor discount factors.


20 Kinematics: In practice for CMT and CMS, we may need to put in extra factors (e.g. a
factor 1/2 for semi-annual coupons), the rate convention (e.g. 30/360, money market), etc.
$$ R^{(\tau)}_{CMT}(t; \mathrm{Node}_\alpha) = 100 \cdot \left[ 1 - P^{(\tau)}(t; \mathrm{Node}_\alpha) \right] \Big/ \sum_{t_j > t}^{t+\tau} P^{(j)}(t; \mathrm{Node}_\alpha) \tag{10.7} $$


Given a model for the discount factors, in this way we obtain the CMT rates. 
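At a single node this is a one-line calculation. The Python sketch below computes the par coupon from a list of discount factors and checks that the bond built with that coupon prices to 100. The discount factors are made up, one coupon per period is assumed, and the extra convention factors mentioned in the footnotes are ignored.

```python
def cmt_rate(discount_factors):
    """Par coupon (per period, per 100 par) of the forward bond at one node.
    `discount_factors` lists the factor for every coupon date after t; the
    last entry is the factor back from the maturity date t + tau."""
    return 100.0 * (1.0 - discount_factors[-1]) / sum(discount_factors)


def bond_value(coupon, discount_factors):
    """Coupons plus par, discounted back to the node."""
    return coupon * sum(discount_factors) + 100.0 * discount_factors[-1]


node_dfs = [0.95, 0.90, 0.85, 0.80, 0.75]     # made-up factors at one node
rate = cmt_rate(node_dfs)
par_check = bond_value(rate, node_dfs)
print(rate, par_check)
```

In a full calculation this is repeated at every node and every reset time, each with its own set of discount factors from the rate model, which is where the state dependence comes from.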


CMS Rates 


To get a CMS rate we would set the appropriate forward Libor swap to par (zero
value) at each node at time t and follow the same procedure.


A CMT Swap is a Volatility Product 


A CMT swap is considerably more complicated than the plain-vanilla swaps we
considered in the previous chapter because of the volatility dependence of the
CMT rate. We need to compute the rates $R^{(\tau)}_{CMT}(t; \mathrm{Node}_\alpha)$ at the different reset
times of the swaplets, and for the different nodes at those reset times. The swaplet
contribution at a given node is proportional to the difference of $R^{(\tau)}_{CMT}(t; \mathrm{Node}_\alpha)$
and the other rate in the swap. This other rate can be fixed at some value E or
can be a floating rate like Libor. In the latter case, the swap is called a
CMT-Libor basis swap.

Because of the volatility dependence of the CMT rate, CMT swaps have 
nonzero vega. Even though there is no optionality written in the contract, because 
CMT swaps have vega they could be put in an “option” book. 


CMT Caps 
We now consider CMT caps²¹. We also need the index $l$ specifying which CMT
caplet we are talking about²², so we write $R^{(\tau;l)}_{CMT}(t_l; \mathrm{Node}_\alpha)$. We numerically
determine, at the $l$-th CMT caplet maturity $t_l$ and at the node α at $t_l$, if


21 Lognormal CMT Rate Dynamics Comments: The CMT (or CMS) rates can be
derived as composite rates from an underlying process as we illustrate here. Alternatively,
these composite rates can be stochastically modeled directly. It should be noted that it is
inconsistent to model a composite rate as lognormal and at the same time model the
“elementary” rates contained in the composite rate as lognormal. This is because any
function (even a sum) of lognormal processes is at best only approximately lognormal.
Naturally, this does not stop the market from quoting the composite CMT volatilities as
lognormal.


22 CMT Rate Notation: Ugly as the notation looks, there is actually another attribute,
namely the date $t_0$ at which data are used to construct the CMT forward curve.
$r_{CMT,l}^{(T)}(t_l; \mathrm{Node}_\alpha) > E$. If so, the caplet gets a
contribution proportional to the difference. The contributions from all nodes
at $t_l$ are added up to get the $l$-th CMT caplet value. The CMT cap value is
the sum of the CMT caplets.
Note that because a CMT swap is volatility dependent, CMT caps and floors
have different volatility dependencies.
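
The node-by-node caplet sum can be sketched as follows. This is only an illustration: the node weights (taken here to be the discounted probabilities of reaching each node, as a lattice code would supply them), the accrual factor, and all function names are assumptions introduced for the example.

```python
def cmt_caplet_value(node_rates, node_weights, strike, accrual=0.5, notional=1.0):
    """Sum the in-the-money contributions over all nodes at one caplet maturity.

    node_rates:   CMT rate at each node (decimal, e.g. 0.06)
    node_weights: assumed discounted probability weight of each node
    """
    payoff_sum = 0.0
    for rate, weight in zip(node_rates, node_weights):
        if rate > strike:  # only nodes with CMT rate above the strike contribute
            payoff_sum += weight * (rate - strike)
    return notional * accrual * payoff_sum


def cmt_cap_value(caplets, strike):
    """The cap is the sum of its caplets; caplets is a list of (rates, weights)."""
    return sum(cmt_caplet_value(rates, weights, strike) for rates, weights in caplets)


# Two hypothetical caplet maturities with three nodes each.
caplets = [
    ([0.04, 0.05, 0.06], [0.25, 0.45, 0.25]),
    ([0.03, 0.05, 0.07], [0.20, 0.50, 0.20]),
]
cap = cmt_cap_value(caplets, strike=0.05)
```

A floor would use `strike - rate` for nodes below the strike in the same loop.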


Numerical Considerations in Calculating CMT Derivatives 


The calculations for CMT products are numerically intensive. This is because
the quantities needed for the pricing depend on the discretization nodes. With
brute-force numerical code, most of the time is spent calculating the
zero-coupon bonds at each node. Once we have the zero-coupon bonds at a node we
immediately get the CMT rate at that node, as we saw above.
We can speed up the calculations. The basic idea is to use a fast analytic
method based on a mean-reverting Gaussian process for calculating the CMT
rate, including the volatility correction, at any future time and for any given
future short-term rate. Explicitly we can use the mean-reverting Gaussian
short-rate model for an approximation to the discount factors
$P^{(T)}(t; \mathrm{Node}_\alpha)$. The explicit expression is given in the
discussion of this model in Ch. 43.

The actual values of the future short-term rates come from a separate code,
for example a lognormal short-rate process code²³. Hence no negative rates
actually appear²⁴.
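
To make the hybrid idea concrete, here is a rough sketch that evaluates analytic Gaussian discount factors on a grid of positive short rates and converts them to CMT-style par rates. It uses the textbook constant-parameter mean-reverting Gaussian (Vasicek) bond formula as a stand-in; the expression in Ch. 43 may differ, and the parameter values, the annual coupon dates, and the function names are all assumptions.

```python
from math import exp


def gaussian_discount_factor(short_rate, tau, mean_rev, long_rate, vol):
    """Zero-coupon price P = A * exp(-B * r) for dr = a*(theta - r)*dt + vol*dW
    with constant a (mean_rev), theta (long_rate) and vol."""
    b = (1.0 - exp(-mean_rev * tau)) / mean_rev
    log_a = ((long_rate - vol * vol / (2.0 * mean_rev ** 2)) * (b - tau)
             - vol * vol * b * b / (4.0 * mean_rev))
    return exp(log_a - b * short_rate)


def cmt_rate_at_node(short_rate, tenor_years, mean_rev, long_rate, vol):
    """Par rate (percent) from analytic discount factors, annual coupons assumed."""
    dfs = [gaussian_discount_factor(short_rate, float(k), mean_rev, long_rate, vol)
           for k in range(1, tenor_years + 1)]
    return 100.0 * (1.0 - dfs[-1]) / sum(dfs)


# Short rates taken from a (hypothetical) lognormal grid, so all are positive.
lognormal_grid = [0.02, 0.04, 0.08]
cmt_rates = [cmt_rate_at_node(r, 10, 0.1, 0.05, 0.01) for r in lognormal_grid]
```

The speedup comes from replacing a backward induction for every zero-coupon bond at every node by one closed-form evaluation per node and maturity.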


23 History: I got the idea of mixing analytic and numerical methods to speed up the
calculations for CMT products in 1994. The speedup over the brute-force lognormal 
numerical code was an order of magnitude. 


24 Gaussian Models and Negative Rates: The analytic Gaussian model gives a
reasonably good numerical approximation for discount factors, and thus for CMT rates. 
There is some negative short-rate contribution to discount factors in Gaussian models. 
However, the short-rate grid in the CMT rate calculation is lognormal, and no negative 
rates appear in the grid. The short rate = 0 axis is in the figure to emphasize this point. 


These days negative rates can appear, so a minimum negative rate can be used. 
11. Interest-Rate Swaptions (Tech. Index 5/10)


We described interest-rate caps in the last chapter. A cap is a collection or basket
of options (caplets), each written on an individual forward rate. A swaption, on
the other hand, is one option written on a collection or a basket of forward rates,
namely all the forward rates in a given forward swap¹. The fact that the swap
option is written on a composite object means that correlations between the
individual forward rates are critical for swaptions.

Swaptions are European if there is only one exercise date, Bermudan if there
are several possible exercise dates, and American if exercisable at any time. The
forward swap into which the swaption exercises can be either a pay-fixed or a
receive-fixed swap².

Swaptions are usually based on Libor. Swaptions on other rates also exist. 


European Swaptions 


A European swaption is an option that, at the swaption exercise date, gives the
right to the swaption owner X to enter into a forward swap. There are two
numbers that characterize a European swaption. The first is the time interval
$\tau^*$ from the value date $t_0$ to the swaption exercise date $t^*$. The
second is the time interval $\tau_{swap}$ from the start date $t_{start}$ of
the forward swap to the maturity date $T_{mat}$ of the forward swap³.
The picture gives the idea for European swaptions:


1 Forward Swap: A forward swap is a swap that starts sometime in the future, at $t_{start}$,
and ends at $T_{mat}$, in the notation of the diagram on the next page.


2 Names for Swaptions: If the swap from the point of view of the broker-dealer is a
receive-fixed swap, the swaption is called a receiver's swaption or a call swaption. If the
swap from the point of view of the broker-dealer is a pay-fixed swap, the swaption is
called a payer's swaption or a put swaption.


3 Jargon: The two numbers $\tau^*$ and $\tau_{swap}$ characterize the swaption by saying “In $\tau^*$ for
$\tau_{swap}$” or “$\tau^*$ by $\tau_{swap}$”. So if $\tau^*$ = 3 years and $\tau_{swap}$ = 5 years the swaption would be
called an “In 3 for 5” or “3 by 5” swaption. Sometimes the total time from now until the end
of the swap is used, i.e. $\tau^* + \tau_{swap}$ is substituted, so the swaption would be called “In 3 for
8” or “3 by 8”. You need to check the convention used locally.
[Figure: Swaption valued today $t_0$ for exercise at $t^*$ into the forward swap
between $[t_{start}, T_{mat}]$. Timeline labels: Exercise, Forward Swap, $T_{mat}$.]


The market-standard formula for a European swaption price $\$C_{swpt}$ is the
Black formula with a constant volatility. Swaption volatilities are quoted in this
language in the market. The result is obtained by assuming that the forward swap
breakeven (BE) rate $R_{BE}$ is lognormal⁴. We have, for a pay-fixed swaption with
strike rate $E$,

$$ \$C_{swpt} = \$X \left\{ R_{BE} \, N(d_+) - E \, N(d_-) \right\} \qquad (11.1) $$

Here,

$$ d_\pm = \frac{\ln(R_{BE}/E) \pm \sigma_{BE}^2 \tau^* / 2}{\sigma_{BE} \sqrt{\tau^*}} \qquad (11.2) $$

The kinematic factor $\$X$ of notionals and discount factors corresponds to the
forward swap,


4 Black Swaption Formula and Consistency Issues: Jamshidian (ref. i) has proved that
the Black formula is an exact result for European swaptions under certain assumptions.
Physically the BE rate corresponds to a swap with zero value which costs nothing. Hence
the BE rate is not an asset, and therefore the swaption is given by the Black formula.
However there are consistency difficulties. Note that if the break-even rate is lognormal,
individual forward rates cannot be exactly lognormal. Also, note that the break-even rates
of all swaps cannot all consistently be lognormal. For example, the BE rates of the three
swaps between successive time intervals $(t_1, t_2)$, $(t_2, t_3)$, and the total time
interval $(t_1, t_3)$ are related and cannot be independently lognormal.
$$ \$X = \sum_{l \in \mathrm{FwdSwap}} \$N_l \, dt_{swap} \, P_{Arrears}^{(l)} \qquad (11.3) $$

The discount factors serve to discount the in-arrears cash-flow payoffs in the
forward swap back to the value date, today $t_0$.


Note that deep in the money, $R_{BE} \gg E$, the pay-fixed swaption approaches
the pay-fixed forward swap. This means that exercise becomes highly probable if
the swaption holder X can exercise into a swap with a large positive swap value,
to X's benefit.
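
The Black formula (11.1)-(11.3) is short enough to sketch directly. The code below is an illustrative implementation under simple assumptions (constant volatility, accrual fractions and discount factors supplied by the caller); the function names and the numerical inputs are made up for the example, and the receiver branch uses the standard mirror-image Black expression.

```python
from math import erf, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def swap_kinematic_factor(notionals, accruals, discount_factors):
    """$X: sum over forward-swap periods of notional * dt * arrears discount factor."""
    return sum(n * dt * p for n, dt, p in zip(notionals, accruals, discount_factors))


def black_swaption(be_rate, strike, vol, time_to_exercise, x_factor, payer=True):
    """Black price of a European payer (pay-fixed) or receiver swaption."""
    std = vol * sqrt(time_to_exercise)
    d_plus = (log(be_rate / strike) + 0.5 * std * std) / std
    d_minus = d_plus - std
    if payer:
        return x_factor * (be_rate * norm_cdf(d_plus) - strike * norm_cdf(d_minus))
    return x_factor * (strike * norm_cdf(-d_minus) - be_rate * norm_cdf(-d_plus))


# Hypothetical 5-period annual forward swap, notional 100 per period.
x = swap_kinematic_factor([100.0] * 5, [1.0] * 5, [0.95, 0.90, 0.86, 0.82, 0.78])
atm_payer = black_swaption(0.05, 0.05, 0.20, 3.0, x, payer=True)
deep_itm_payer = black_swaption(0.10, 0.01, 0.20, 1.0, x, payer=True)
forward_swap_value = x * (0.10 - 0.01)  # pay-fixed forward swap for comparison
```

With these inputs the deep in-the-money payer price is numerically indistinguishable from the forward swap value, which is the limit described above.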


Put-Call Parity and European Swaptions 


We have a relation between options that follows simply from the fact that the
sum of all probabilities of all paths starting at a given point has to be one. This
becomes the statement of put-call parity for European options⁵. The probability
sum rule is translated into the statement about the normal functions
$N(x) + N(-x) = 1$. We illustrate the result for European swaptions:

$$ \$C_{\text{Payers Swaption}} - \$C_{\text{Receivers Swaption}} = \$S_{\text{Pay-Fixed Swap}} \qquad (11.4) $$

This equation also holds for the hedges ($\Delta$, $\gamma$, vega).
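
For the Black prices, the parity relation can be checked in a few lines. The derivation below assumes the receiver swaption is priced by the mirror-image Black expression $\$X\{E\,N(-d_-) - R_{BE}\,N(-d_+)\}$ with the same volatility and the same kinematic factor; that receiver formula is the standard one but is not written out in the text.

```latex
\begin{aligned}
\$C_{\text{Payers}} - \$C_{\text{Receivers}}
  &= \$X\left\{R_{BE}\,N(d_+) - E\,N(d_-)\right\}
   - \$X\left\{E\,N(-d_-) - R_{BE}\,N(-d_+)\right\} \\
  &= \$X\left\{R_{BE}\left[N(d_+) + N(-d_+)\right]
   - E\left[N(d_-) + N(-d_-)\right]\right\} \\
  &= \$X\,(R_{BE} - E) \;=\; \$S_{\text{Pay-Fixed Swap}}
\end{aligned}
```

The volatility drops out in the last line, which is why the combination has no vega.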


European Swaption Volatility 
In general, the volatility for a composite rate (the BE rate) will be less than the 
volatilities for individual rates. This is because correlations are less than perfect 
and some diversification occurs. The situation can be thought of similarly to the 
volatility for a basket of equities. The volatility of the basket of equities is 
lowered because some stock prices can go up while others go down. In a similar 
fashion, some forward rates can go up while others go down, keeping the BE rate 
relatively constant or less volatile. 

In particular, if $\rho_{ll'}$ is the correlation between forward rate returns, if all
dynamics are lognormal, and if we ignore the (relatively small) rate dependence
of the discount factors, then the lognormal BE volatility $\sigma_{BE}$ is approximately

$$ \sigma_{BE}^2 \approx \sum_{l,l'} \xi_l \, \sigma_l \, \rho_{ll'} \, \sigma_{l'} \, \xi_{l'} \qquad (11.5) $$


5 No Put-Call Parity for American Options: Because of the complex nature of exercise, 
although the sum rule for probabilities of course remains, no simple relation exists for 
American or Bermuda options. 
Here, a sort of “component weight” $\xi_l$ is given in terms of the ratio of the time
averages of the $l$-th forward rate in the forward swap and the BE rate. This time
average is indicated by brackets $\langle \; \rangle$. It is to be taken over the same time
window as that defining the volatility and correlations. To get this formula,
factorization approximations were used for ratios of averages and for products of
averages. Explicitly,

$$ \xi_l = \frac{\langle f_l \rangle}{\langle R_{BE} \rangle} \cdot \frac{\$N_l \, dt_{swap} \, P_{Arrears}^{(l)}}{\$X} \qquad (11.6) $$
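
The diversification effect in the composite volatility is easy to see numerically. The sketch below evaluates the quadratic form for the BE variance with made-up component weights, forward-rate volatilities, and correlations; the function name and all inputs are assumptions for illustration.

```python
from math import sqrt


def be_lognormal_vol(weights, vols, corr):
    """Approximate lognormal BE volatility from component weights, forward-rate
    volatilities, and the forward-rate correlation matrix."""
    n = len(weights)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            variance += weights[i] * vols[i] * corr[i][j] * vols[j] * weights[j]
    return sqrt(variance)


# Two forward rates with equal weights and 20% vols (hypothetical numbers).
weights = [0.5, 0.5]
vols = [0.20, 0.20]
perfect = be_lognormal_vol(weights, vols, [[1.0, 1.0], [1.0, 1.0]])
partial = be_lognormal_vol(weights, vols, [[1.0, 0.6], [0.6, 1.0]])
```

With perfect correlation the composite volatility equals the common forward-rate volatility; with a correlation of 0.6 it drops to about 17.9%.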


Hedging European Swaptions: Delta and Gamma (The Matrix) 


European swaption $\Delta$ risk is hedged using the same techniques as previously
described using ladders. Note that the $\Delta$ ladder will only be significant in the
period of the forward swap. Before the start of the forward swap, small residual
$\Delta$ effects are present from the discount factors.

European swaption $\gamma$ risk needs to take into account the fact that gamma is a
matrix $\gamma_{ll'}$. The approximate factorization relation
$|\gamma_{ll'}| \approx \sqrt{|\gamma_{ll} \, \gamma_{l'l'}|}$
mentioned in the section on swaps works reasonably well for swaptions to
include the effects of the off-diagonal terms. We emphasize again that this only
holds deal-by-deal, i.e. for each swaption separately, not for a portfolio.


A Paradox, a Paradox, a Most Ingenious Vega Paradox⁶


European swaption vega risk is tricky and involves a little paradox. On the one
hand, if the swaption is exercised all volatility dependence disappears. This is
because the swap obtained by exercising the swaption has no volatility
dependence. This would imply that the vega should be concentrated in the vega
bucket containing the swaption exercise time $t^*$. On the other hand, the forward
rates $f_l$ individually making up the forward swap associated with the swaption
live at times $t_l$ after $t^*$. Hence this argues that the sensitivities to the individual
forward rate volatilities $\sigma_l$ should be spread out in the buckets at times $t_l$ after
$t^*$. The paradox therefore is how to construct the vega risk report⁷.


6 Musical Reference: Listen to the trio of Ruth, Frederic, and the Pirate King in The
Pirates of Penzance, Gilbert and Sullivan (No. 19). The paradox in the operetta has to do 
with whether birthdays should appear or disappear, relative to Frederic’s ability to 
exercise an option to leave the pirates and subsequently exterminate them. 


7 Vega Paradox: There is no good way out of this paradox. Different desks report vega 
using either of the two methods presented in the text. If the procedure spreading out vega 
Bermuda/American Swaption Pricing


A Bermuda swaption has a discrete exercise schedule (usually every 6 months 
after a lockout period during which no calls are allowed). Swaptions and callable 
bonds are closely related. A “10NC3 bond” means a 10-year bond that is callable
after 3 years. The call option embedded in the bond corresponds to a Bermuda 
swaption, which can be exercised after 3 years. The exercise prices usually vary 
with time in a schedule. For a non-amortizing swap, we have seen that we can 
rewrite a receiver’s swap as a long bond and a short FRN. Thus on a given 
exercise date, the schedule might specify that ABC would need to pay 102 for the 
bond originally at par (100). 

American swaptions, which can be exercised at any time (perhaps after a 
lockout period), are less common than Bermudas largely because of the close 
relation of Bermuda swaptions with callable bonds. 

Bermudas/Americans are priced with a dynamical stochastic rate model,
using the same back-chaining algorithms used to price callable bonds.
Recently American Monte Carlo has been used. Typically, a short-rate diffusion
process can be used⁸. A description is given later in the book (cf. Ch. 44).


Determination of the Local Forward Term-Structure of Volatility 


In order to determine the local forward term structure of volatility for the 
Bermuda algorithm, we proceed using the normalization to the European 
swaption market. This involves the pricing of a number of European swaptions 
using the Bermuda algorithm, and varying the local short-rate forward volatility 
in the Bermuda algorithm until agreement with a grid of European swaption 
prices is obtained. This procedure has to be watched carefully, as it can become 
numerically unstable. In particular, cutoffs need to be imposed such that the local 
volatilities do not become negative, unphysically small, etc. The underlying 
cause of this sort of problem is that the European swaption market prices are not 
completely internally consistent, viewed from the Bermuda model perspective. 
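
The role of the cutoffs can be illustrated with a deliberately simplified stand-in. Instead of repricing European swaptions in a Bermuda lattice, the sketch below strips piecewise-constant forward volatilities from a term structure of cumulative implied volatilities and floors any bucket that would otherwise need a negative or tiny variance. The stripping rule, the floor level, and the input numbers are assumptions for illustration only.

```python
from math import sqrt


def strip_forward_vols(times, implied_vols, vol_floor=0.01):
    """Bootstrap forward vols bucket by bucket, flooring unphysical values.

    times:        increasing exercise times in years
    implied_vols: cumulative (spot) implied vol quoted for each time
    Returns (forward_vols, floored_flags).
    """
    forward_vols, floored_flags = [], []
    prev_t, built_var = 0.0, 0.0
    for t, vol in zip(times, implied_vols):
        dt = t - prev_t
        target_var = vol * vol * t            # total variance the quote asks for
        fwd_var = (target_var - built_var) / dt
        floored = fwd_var < vol_floor ** 2    # negative or unphysically small
        fwd_vol = vol_floor if floored else sqrt(fwd_var)
        forward_vols.append(fwd_vol)
        floored_flags.append(floored)
        built_var += fwd_vol * fwd_vol * dt   # variance actually accumulated
        prev_t = t
    return forward_vols, floored_flags


# The 3-year quote is too low to be consistent with the 1- and 2-year quotes.
fwd_vols, flags = strip_forward_vols([1.0, 2.0, 3.0], [0.20, 0.20, 0.15])
```

A flagged bucket means the input quotes cannot all be matched by a non-negative local volatility, which is the kind of internal inconsistency described above.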


over the period of the forward swap is used, the risk system must be clever enough to 
drop all the risk at the swaption exercise date. Far from exercise, it seems more 
reasonable to use the spread-out vega method, but close to exercise, it is probably better 
to use the concentrated-vega method. 


8 Other Rate Processes for Bermuda Swaption Pricing and Job Possibilities: There is
a variety of short-rate processes available. The most common “Street-Standard” is 
lognormal. A Gaussian model with one or two factors plus mean reversion is also 
popular. Multifactor models can also be used. Transformations of hidden variables into 
physical short rates can be employed. There would seem to be employment possibilities 
for quants here that could last for years. 
Bermuda/American Swaption Hedging


Vega for a Bermuda/American swaption book is hedged using other volatility 
products. Because the algorithms are normalized to the European swaption 
market, and because in some sense Bermuda options are a mélange of European 
options shielded from each other by the complex exercise logic, European 
swaptions are commonly used as approximate hedges. These hedges must be 
modified as time progresses⁹.

Hedging depends on the shape of the yield curve and the shape changes. This 
is because as the yield curve shape changes, the probability of early exercise 
changes, so the European hedges need modification. 

Vega for deep ITM swaptions is small and hard to evaluate. Oscillations in
vega for discrete codes can be observed as a function of the amount of volatility
change $\delta\sigma$. These oscillations are typical of numerical noise.

A significant potential problem is that the Bermuda (and American)
swaptions are quite illiquid. Therefore in a stressed market, a Bermuda swaption
seller can suffer substantial losses that cannot be hedged away.


Swaption Delta Hedging and Numerical Noise 
Delta $\Delta$ and gamma $\gamma$ are hedged using numerical differences derived from the
algorithms. The magnitude of the change of the curve $\delta r$ to get these differences
needs to be chosen carefully in order to avoid spurious numerical instabilities for
the discretized algorithm; in particular $\delta r$ cannot be too small, certainly more
than 1 bp¹⁰. At least 10 bp should be used, and 50 bp has been used.
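
The effect of the bump size can be reproduced with a toy example. The sketch below computes central-difference delta and gamma and compares a smooth pricing function with a crude stand-in for a discretized algorithm, namely the same price resolved only to 0.01. The pricing function and the noise model are assumptions made purely for illustration.

```python
from math import exp


def bump_delta_gamma(price_fn, rate, bump_bp=10.0):
    """Central-difference delta and gamma for a parallel bump of bump_bp basis points."""
    h = bump_bp * 1.0e-4
    up, mid, down = price_fn(rate + h), price_fn(rate), price_fn(rate - h)
    delta = (up - down) / (2.0 * h)
    gamma = (up - 2.0 * mid + down) / (h * h)
    return delta, gamma


def smooth_price(rate):
    """Toy price: a 5-year zero-coupon bond with notional 100."""
    return 100.0 * exp(-5.0 * rate)


def lattice_like_price(rate):
    """Same price, but resolved only to 0.01 to mimic discretization noise."""
    return round(smooth_price(rate), 2)


exact_delta = -500.0 * exp(-0.25)
for bump_bp in (1.0, 10.0, 50.0):
    noisy_delta, _ = bump_delta_gamma(lattice_like_price, 0.05, bump_bp)
    print(f"bump {bump_bp:5.1f} bp  delta {noisy_delta:9.2f}  exact {exact_delta:9.2f}")
```

For a fixed price resolution, the error in the difference quotient scales like the resolution divided by the bump, so a 1 bp bump amplifies the noise fifty times more than a 50 bp bump. The price of a larger bump is a larger truncation error, which is small here.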

The $\Delta$ ladder for Bermuda/American swaptions is tricky, especially at the
short end. Occasional instabilities can be observed where neighboring delta
buckets have canceling large fluctuations, indicating small overall risk but
spurious differential calendar risk¹¹. These instabilities are worse with deals near


9 Acknowledgement: I thank Ravit Mandel for an informative conversation on practical
aspects of swaptions. 


10 Oscillating Convergence: As $\delta r$ is increased from 1 bp, an oscillating convergence can
sometimes be seen for $\Delta$ as a function of $\delta r$. Again, as with vega, the phenomenon
observed here is typical of numerical noise. The amount of oscillation also depends on
the type of curve shifted.


11 Spurious Numerical Instabilities and Management Meetings: The presence of these
occasional numerical instabilities can become a concern to nervous management. One 
expedient is just to smooth out the fluctuation by hand in the report and wait for the 
problem to go away tomorrow. Alternatively, the fluctuations can be smoothed out in the 
code. Of course the grid in the algorithm can be further refined, but this may take 
considerable time and detract from other activities. Such instabilities naturally provide a 
great opportunity to go to meetings to discuss the whole thing with everybody. At the end 
of the day, if the management is too wrapped up in focusing on spurious noise issues with 
little real risk, considerable time can be wasted. Some numerical noise will be present no 
matter how much the code is refined. Still, the quants should monitor numerical noise. 
exercise. This is partially because then only a few nodes of the algorithm can be
used to determine the swaption. Instabilities are also magnified by anomalies in
the cash rates, in particular if the short-end cash curve becomes inverted, e.g. the
1M cash rate is above the 6M cash rate, a situation that does sometimes occur¹².


Delta and Vega Risk: Move Inputs or Forwards? 


Delta Risk — Which to move, swap rates or forward rates? - Illustration 


[Figure: Forward rates change wildly if the 5-yr swap rate is increased while
holding other inputs fixed. Labels: Futures fixed; Big upwards jump; Negative moves.]


12 The noise is also increased if the curve generation itself leads to discontinuities in the
forward rates, that is, if the forward rates are not smoothed out in the curve-generation 
algorithm itself. 




In the above picture, we move the 5-year swap rate, holding other input swap 
rates fixed. We get wild forward rate fluctuations and big changes in swaption 
exercise probabilities. 

Say the forward curve is generated by 4 years of futures along with the 5, 7,
10, etc. year swap rates. Now consider as an example moving the 5-year swap
rate $R_{5\text{-yr}}$ up say 10 bp, holding all other input rates (futures, other swap rates)
fixed. Because the futures are held fixed, the forward rates out to 4 years are
fixed. The only way to get $\delta R_{5\text{-yr}} = 10$ bp is if the forward rates between 4 and 5
years increase much more than 10 bp, roughly 50 bp. A pay-fixed swaption with
a first exercise time at 4 years and forward swap between 4-5 years can
be highly affected. If this swaption starts out 50 bp OTM, it will become ATM
under this scenario. Other swaptions can also be affected.

On the other hand, if individual forward rates $f_l$ are increased by the nominal
10 bp, the swaption 50 bp OTM will still be 40 bp OTM after the shift and
exercise will not be affected much¹³.
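
The "roughly 50 bp" figure follows from simple arithmetic: the 5-year swap rate is approximately an average over five annual forward rates, so if four of them are pinned the fifth must absorb about five times the move. A small sketch, assuming annual periods, a flat 5% starting curve, and hypothetical function names:

```python
def par_swap_rate(forwards):
    """Annual-pay par swap rate from consecutive one-year forward rates."""
    df, dfs = 1.0, []
    for f in forwards:
        df /= 1.0 + f
        dfs.append(df)
    return (1.0 - dfs[-1]) / sum(dfs)


def last_forward_for_target(fixed_forwards, target_swap_rate):
    """Solve for the final one-year forward that reproduces the target par
    swap rate when all earlier forwards are held fixed."""
    df, annuity = 1.0, 0.0
    for f in fixed_forwards:
        df /= 1.0 + f
        annuity += df
    # Par condition 1 = R * (annuity + P_last) + P_last, solved for P_last.
    p_last = (1.0 - target_swap_rate * annuity) / (1.0 + target_swap_rate)
    return df / p_last - 1.0


fixed = [0.05] * 4                                  # years 0-4 pinned by the futures
base_rate = par_swap_rate(fixed + [0.05])           # 5.00% flat curve
new_forward = last_forward_for_target(fixed, base_rate + 0.0010)  # swap rate +10 bp
forward_move_bp = (new_forward - 0.05) * 1.0e4
```

In this toy setup the 4-to-5-year forward has to rise by about 55 bp to move the 5-year swap rate by 10 bp, in line with the order of magnitude quoted in the text.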

An important point is that even though the 5-year swap is a hedge vehicle, in 
real markets swap rates and futures are highly correlated. Therefore while 
moving the 5-year swap rate seems like a good idea, it does not correspond to 
realistic market moves and it can lead to anomalous swaption behavior¹⁴.


Vega Risk — Which to move, swaption vols or forward vols? 


A related conundrum involves the calculation of vega risk. Suppose that we ask 
for the response of a Bermuda swaption to moving one input volatility, say a 
3-year volatility, keeping other input volatilities constant. The calculated local 
forward volatilities must change in a wild manner (similar to the picture above) 
in order to achieve these input vol changes. Again, apparently simple input 
changes can lead to anomalous behavior. 


Swaptions and Corporate Liability Risk Management 


We recall from the discussion of swaps that corporation ABC had issued a fixed- 
coupon bond and then entered into a swap with a broker-dealer BD to receive 


13 Kinks: A tricky subject is kinks produced by multifactor yield curve YC models,
described in Ch. 48. It would be interesting to see if YC kinks are produced by 
multifactor forward rate models, both in forward rate space and in swap YC space. 


14 Discussions of the Delta: Many discussions can occur regarding these considerations.
This is not a discussion of noise but a substantive discussion of procedure in risk probing,
which is important in principle. It is sometimes difficult for people to grasp the issue of
why something that seems so simple (“just move a swap rate”) is in reality quite subtle.
fixed and pay floating. If the bond is callable and market rates drop enough, ABC
may decide to call the bond and then reissue debt at the more favorable current
low market rates¹⁵. In this way, ABC is acting to minimize its liability risk.
Consider the following diagram: 


[Figure: Reversed Swap from Swaption, between ABC Corporation and the Wall
Street Broker-Dealer Swap Desk.]


In tandem, ABC will want to cancel the swap at the same time. However, this 
is only possible if the swap contract has cancellation provisions. So to anticipate 
this, ABC has already placed into the swap contract the possibility to cancel the 
swap at the same times as in the bond call schedule. In this way, the bond and 
the rest of the swap disappear at the same time. This cancellation mechanism in 
the swap can be viewed as the exercise of an option (swaption) to enter into a 
forward swap to pay fixed and receive floating, canceling all cash flows of the 
original swap after the time of exercise. 

From the point of view of the broker-dealer, the swap has a receive-fixed
swaption, also called a receiver's swaption or “call swaption” (the latter because
ABC has in tandem called the bond).

In the section on swaps, we mentioned that a non-amortizing plain-vanilla 
swap can be written as the difference of a non-callable bullet bond and an FRN, 
and that at a swap reset date the FRN is at par, FRN =100. We now show that 
the same relation holds for cancelable swaps and callable bonds, with the fixed 
swap rate being equal to the bond coupon. 


15 Decision of ABC to Call the Bond and Refinance: A decision of ABC to call the
bond and refinance is not made lightly. There are a number of practical considerations, 
including costs of issuance, the market for new bonds in the particular sector 
corresponding to ABC, changes in the ABC credit rating that might have occurred since 
the last issuance, etc. In practice, ABC considers: (1) the discounted savings realized by 
refinancing, measured against (2) the loss of the option of the called bond. A threshold 
for the gain has to be there just for the extra work involved. 
If the swap is cancelable, an option to stop the swap exists. If the bond is
callable, an option to stop the bond exists. Assume that the dates for exercise are
the same for the swap and bond, and that these dates are at swap resets. At a
given reset/exercise date, there is no difference between exercising the two (bond
and swap) options, because at exercise the forward swap and forward bond just
differ by a par FRN. In summary, at reset/exercise times the two options (swap
option, bond option) are equal, so the cancelable swap and the callable bond
differ by the FRN. For the receive-fixed swap in the example, this condition at
resets/exercises is

$$ \$S_{\text{Cancelable Receive-Fixed Swap}} = \$B_{\text{Callable Bond}} - 100 \qquad (11.7) $$


Practical Example: A Deal Involving a Swaption. 


Here is the diagram of the deal: 


[Figure: Receivers Swaption Deal. ABC sells the swaption and gets cash
$\$C_{swpt}$ now; the broker-dealer buys the swaption for price $\$C_{swpt}$.
Also labeled: Future Floating Debt.]
An entity¹⁶ ABC announces that it will take bids on a plan to get an
immediate amount of cash. ABC asks various broker-dealers (BD) to bid¹⁷ to
pay $\$C_{swpt}$ now to ABC for a Bermuda receivers swaption giving one BD the
right to begin a swap with ABC at one of a set of designated times in the future.
This forward swap is for BD to receive a fixed rate $E$ and pay a floating
rate. ABC will pay an above-market rate $E$ in the swap, implying a cheap
swaption price $\$C_{swpt}$ paid by BD.


In strategic terms, ABC thinks rates are low now and thus $E$ is relatively low
(even given the extra spread), making it desirable to act now. BD gets the extra 
spread. Thus, the deal can appear attractive from both sides. 

ABC may regard the swap as preferable to issuing new debt now. This could 
be for a variety of reasons. First, outstanding ABC debt may not be callable for 
some time, and ABC may not want additional debt (for example ABC may not 
have the cash for the additional payments). Second, if ABC is a municipal 
authority, there are federal limits on bond refundings. 

ABC can arrange the exercise dates on the swaption to match the call
schedule of existing bonds, which form new potential bond refunding dates in the
future. Actions are synchronized. If BD does exercise the swaption, then ABC
can issue floating rate debt canceling the floating side of the forward swap. If
BD does not exercise the swaption, ABC keeps the $\$C_{swpt}$ cash and just
continues the current bonds.

"Net-net" as they say, after the dust has cleared, ABC winds up paying a
reasonable fixed rate and gets the desired up-front cash $\$C_{swpt}$ regardless of the
scenario, exactly as it wants.
BD buys the Bermuda swaption at the low price $\$C_{swpt}$, which translates
(through the BD pricing model) into an equivalent low volatility. Theoretically,
even though the Bermuda swaption involves an extra premium relative to
European options, the Bermuda volatility can be lower than the European
volatility in practice, because ABC wants to get the deal done.

On the other hand, there is a liquidity issue. Once purchased from ABC, BD
cannot sell the Bermuda swaption because there is no inter-dealer market for it. 
Thus, BD simply holds it, hedging it approximately with European swaptions 
that can be sold at market volatility, thus "locking in" a profit. The profit for BD 
is in return for providing the service to ABC as described above. 

The initial analysis by BD goes like this. If rates do not go down, BD will 
not exercise its long receivers Bermuda swaption. In this case, from the point of 


16 ABC: In fact ABC was the Metropolitan Transportation Authority in New York. The
announcement was in 1999. See Bloomberg News (ref.)


17 Beauty Contest: The activity described in the text is called a “beauty contest". The 
winner is whichever broker-dealer comes in with the best deal for ABC. 
view of BD, the premium $\$C_{swpt}$ is lost. On the other hand, through judicious
hedging with short positions in European receivers swaptions that are relatively
liquid, the $\$C_{swpt}$ premium along with a profit can be made back because in this
case the short European receivers swaptions will gain value. If rates do go down,
BD will exercise the Bermuda swaption, remove the hedges, and pick up the
above-market fixed coupon $E$ from ABC.

In addition to collecting the above market rate including the profit on the 
spread, BD might think that rates will decrease further, making the floating 
payments to ABC even lower than predicted by the forward rate curve on which 
the current pricing is based. Such "what-if" analyses would be based on trader 
market judgments involving macroeconomic or other considerations. 

The relative risk-reward considerations leading to a bid by BD of a specific
amount $\$C_{swpt}$ for a given rate $E$ involve other aspects. Standard considerations
of BD include (high) costs for the traders, overhead costs for systems and reports
for keeping track of the risk of the swaptions, transaction costs involved in the
hedging, etc. Moreover, the swaps desk may have confining risk limits on the
amount of volatility exposure to Bermuda swaptions that the management allows.

One low-probability but high-impact risk is that somehow BD will be forced 
to sell the Bermuda swaption in a hostile environment (e.g. the BD company 
decides to get out of the interest rate swaptions business) and thus loses the profit 
and more. Such examples have happened in the past. 


Miscellaneous Swaption Topics 


Liquidity and Basis Risk for Swaptions 


As mentioned already, Bermudas and Americans are highly illiquid in the 
secondary market. Thus, considerable basis risk exists with European volatility. 
Typically, limits will be set depending on the risk tolerance for this basis risk. 


Fixed Maturity vs. Fixed Length Forward Swap 


Generally, the forward swap arising from swaption exercise has a fixed maturity 
date. Sometimes the forward swap lasts for a fixed time period after the date of 
exercise. For European swaptions, this is the same thing. For Bermuda or 
American swaptions, it is different because there are several possible exercise 
dates. 
Caplets and One-Period European Swaptions


A one-period payers swaption (e.g. exercising into 6M Libor at a given date) is 
theoretically the same as a caplet, since the definitions coincide. However, for 
CMT the equivalence is not exact. With some assumptions regarding the absence 
of volatility in the discount factors, the equivalence in the CMT case can however 
be shown to be approximately true. 


Advance Notice 


There is an advance-notice feature, where the swaption holder announces, before 
the actual exercise date, that he is going to exercise the swaption. This period is 
usually 30 days. The uncertainty of exercise is thus eliminated before the exercise 
date, and the advance notice stops the diffusion process. 


Skew for Swaption Volatility - the Vol Cube 


In previous chapters, we have discussed skew for equity and FX options
extensively. Swaptions also have skew complications. The price of a swaption
has an extra skew dependence on the strike or exercise rate $E$. Specifically, if a
single volatility $\sigma$ is used to price two swaptions identical except for having
different strike rates $E_1$ and $E_2$, the model prices $C_1$ and $C_2$ are not exactly
market prices. To deal with this, the volatilities are made a function of the
exercise rate to get an equivalent effective volatility, $\sigma(E)$. Skew here is the
same idea as for equity or FX options. Really all that is going on is that a simple
model is adjusted to reproduce market prices for the securities by modifying the
model parameters. Essentially, the model really just serves as an interpolation
scheme using $\sigma(E)$.

Because volatility as quoted in the broker market is LN (lognormal) as
described above, we can use $\sigma_{LN}(E)$ with the Black formula. This $\sigma_{LN}(E)$ can
be either a parameterized function or a numerical look-up table with
interpolations between the (sometimes sparse) market-implied data. In order to
reproduce the market values reasonably, several parameters may have to be used.

The “Vol Cube” is the collection of swaption vols in the three relevant
dimensions (x, y, z): x = Swaption exercise time, y = Length of swap into which
swaptions exercise, and z = Strike¹⁸.
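
A look-up-table version of the vol cube can be sketched in a few lines. This is only one possible layout: the class name, the choice to key smiles by (exercise time, swap length), linear interpolation in strike, and flat extrapolation outside the quoted strikes are all assumptions, and the quotes are invented.

```python
from bisect import bisect_left


class VolCube:
    """Minimal swaption vol cube: one strike smile per (exercise time, swap length)."""

    def __init__(self):
        self._smiles = {}

    def add_smile(self, exercise_time, swap_length, strikes, vols):
        """Store a smile; strikes must be strictly increasing."""
        self._smiles[(exercise_time, swap_length)] = (list(strikes), list(vols))

    def vol(self, exercise_time, swap_length, strike):
        """Lognormal vol at a strike: linear inside the quotes, flat outside."""
        strikes, vols = self._smiles[(exercise_time, swap_length)]
        if strike <= strikes[0]:
            return vols[0]
        if strike >= strikes[-1]:
            return vols[-1]
        i = bisect_left(strikes, strike)
        w = (strike - strikes[i - 1]) / (strikes[i] - strikes[i - 1])
        return vols[i - 1] + w * (vols[i] - vols[i - 1])


cube = VolCube()
cube.add_smile(3.0, 5.0, strikes=[0.03, 0.05, 0.07], vols=[0.25, 0.20, 0.18])
vol_3by5 = cube.vol(3.0, 5.0, strike=0.04)
```

Interpolation in the exercise-time and swap-length dimensions is left out here; with sparse strike data the extrapolation rule is itself a modeling choice.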

Another procedure is to change the process from LN to “not-LN”. The skew 
effect as viewed from the LN model is replaced by the different dynamics from 
the not-LN model. For example, the “Lognorm Model" mixes LN and normal 


18 The Swaption Vol Cube: In practice, the data are scarce especially for the strike
dependence of swaption vols. Therefore approximations or substitutions may have to be 
made. 
(Gaussian) processes. Because the rate changes and rate paths are different than
in the LN model the probabilities of exercise become modified. Therefore, with a
constant volatility $\sigma_{not\text{-}LN}$ for this “not-LN” model, we might hope to get
market prices.

The behavior of swaption skew can be inferred qualitatively. Say that the 
process that reproduces skew is some mix of Gaussian and lognormal. Because 
Gaussian rate probabilities are not suppressed at low rates, more Gaussian paths 
will go near r =0 than in the LN model. For this reason, the equivalent LN 
volatility increases near r = 0. Therefore, for low strikes that probe the low-rate 
region, the effective LN volatility increases. In summary, the LN equivalent
volatility with skew, σ_LN(E), increases if strikes E decrease.


Since far-from-the-money swaptions do not trade regularly¹⁹, model prices
are used. Sometimes quants believe that their models give the "correct" prices
when there are no market quotes, or sometimes even when the market quotes
differ from the model²⁰.


Model Dependence of Vega and Risk Aggregation Problems 


Because volatility parameters in non-LN models are not standard LN volatilities,
the volatility dependence (vega) for risk management is also model dependent²¹.
This difference in vega between models for the same Bermuda swaption can be 
on the order of 10%. This can lead to aggregation problems for volatility risk 
between desks that use different models. 

19 Far From the Money Options: Such options do not trade regularly, and so market
quotes are often not available. In a sense, this is not terribly critical, since if an option is
far ITM it will probably be exercised with a payoff independent of the volatility, while if
it is far OTM it is not worth much regardless of the volatility. For transactions of far
OTM options, a "nuisance" charge is sometimes applied because it costs money to
monitor the option in addition to the small option value. A nuisance charge cannot be
modeled by skew.


20 Aristotle's Horse and Correct Models: The difficulty of obtaining market
quotes for illiquid instruments puts the egoistic modeler in the same boat as Aristotle
in a possibly apocryphal story. The story is that Aristotle theoretically (and
incorrectly) deduced the number of teeth in a horse, and then refused to look at a real
horse to count the teeth because he said his idea had to be correct. The decisive test for a
model is whether the model price is near the price received if the option is sold.


21 Misunderstanding Vega: The fact that vega is model dependent can lead to
misunderstandings. Comprehension of the technical fine points of volatility in
complicated models outside a quant group is generally shaky. Therefore, the problem can
exist but go unnoticed.


One way of avoiding such aggregation problems is to demand that desks
uniformly carry out the same risk procedure. This procedure could be to move the
input LN volatilities for European swaptions by 1%, revalue the whole swaption
book using whatever non-LN model the desk has, and then take the difference of
the revalued book with the value of the book using the original input LN
European swaption volatilities.
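
The uniform procedure can be sketched as a bump-and-revalue wrapper around whatever valuation function a desk supplies. This is an illustration under stated assumptions: the "1%" move is read here as one absolute vol point (0.01), the function and type names are hypothetical, and the two desk valuers are toy stand-ins rather than real non-LN models.

```python
from typing import Callable, Dict

Vols = Dict[str, float]                 # input LN European swaption vols, keyed by a label
BookValuer = Callable[[Vols], float]    # a desk's own model: input vols -> book value


def uniform_vega(value_book: BookValuer, input_ln_vols: Vols, bump: float = 0.01) -> float:
    """Change in book value when every input LN vol is moved by `bump`.

    The desk's internal model is a black box; only its inputs are bumped,
    so the result is comparable across desks that use different models.
    """
    base = value_book(input_ln_vols)
    bumped = {label: vol + bump for label, vol in input_ln_vols.items()}
    return value_book(bumped) - base


# Toy stand-ins for two desks with different internal models (hypothetical).
def desk_a_value(vols: Vols) -> float:
    return sum(1000.0 * v for v in vols.values())


def desk_b_value(vols: Vols) -> float:
    return sum(400.0 * v for v in vols.values())


MARKET_VOLS = {"1y into 5y": 0.20, "2y into 5y": 0.19, "5y into 5y": 0.17}

# Because both desks respond to the same input move, the vegas can be added.
firm_vega = uniform_vega(desk_a_value, MARKET_VOLS) + uniform_vega(desk_b_value, MARKET_VOLS)
```

The design choice is that aggregation happens on the common inputs (the quoted LN vols), not on each model's internal volatility parameter.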


Theta 


The measurement of theta Θ (time decay) for swaptions can be monitored.
Revaluations can be done one day apart. In performing the revals, it must be
specified whether rates are held constant or whether the rates are allowed to slide
down the yield curve, as mentioned in previous chapters.


12. Portfolios and Scenarios (Tech. Index 3/10)


Introduction to Portfolio Risk Using Scenario Analysis 


In this chapter, we introduce the analysis of portfolio risk using scenarios. We 
first consider the definitions of portfolios and of scenarios. We discuss scenario 
types. These include instantaneous scenarios and scenarios in the future. We 
discuss a simple scenario simulator. We give a few comments on presentations. 
More elaborate calculations of portfolio risk are discussed later in the book, 
including VAR or Value at Risk (Ch. 26-30) and counterparty risk (Ch. 31). A 
sophisticated framework called “Smart Monte Carlo” or SMC is in Ch. 44.


Definitions of Portfolios 


The definitions of the portfolios depend on the goals. Among the variables are: 


• Product
• Level of Granularity
• Level of Aggregation
• Type of Risk Analysis


These variables are all inter-related. An individual product can mean, for 
example, equity barrier options on the equity OTC derivatives desk, Libor swaps 
on the interest-rate swap desk, non-investment-grade bonds on the high-yield 
desk, CMOs on the mortgage desk, etc. A more refined level of granularity can 
mean, for example, Libor swaps with maturity only between 5 to 10 years. A 
high level of aggregation can mean, for example, the total vega in the Fixed 
Income Department across all desks. Many types of analysis can be done: at 
different aggregation or granularity levels, with different types of scenarios, with 
various reports, etc. 

For a single product level on a desk, some refined analyses can be run that are 
prohibitive at an aggregate level. At an aggregate level, approximations may have 
to be made due to time and programming resource constraints. At the firm-wide 
level, the collection of risk measures can be a major, lengthy undertaking. For 
any aggregation, care should be taken systematically to collect enough
information so that subsequent analyses using various data-view cuts of interest
for different reports can be accomplished.

A special case is a single deal under consideration that needs to be analyzed 
for risk. Deal risk is also considered from the vantage of possible diversification 
of risk in larger portfolios².

Here is a simple portfolio of three hypothetical interest-rate derivative deals 
used in my CIFEr tutorial. The portfolio is called “Port 1”. The definitions of the 
Greeks are the same as in the earlier chapters. The input curves, volatilities, etc. 
used for valuation and risks are from the date given. Each deal has its own file in 
which the specifics are listed, including the definition of the deal, the curves used 
for pricing, the hedging results, etc. 


Portfolio Definition 
Port 1 
4/1/00 


Type Value $000 Delta Gamma 
Swap $3,163 1,656 
Cap $4,301 1,303 
Swaption $1,362 684 
$8,826 3,643 


All such information from portfolios should be stored in an appropriate
database³.

In addition, we can look at the risk ladders for each deal, as explained in 
previous chapters. The total ladders summed over deals would be examined to 
determine hedging requirements for the portfolio. 

Realistic portfolios can have dozens, hundreds, or thousands of deals 
depending on the situation. 
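
As a small illustration, the Port 1 totals can be reproduced by summing over deals. The class and field names below are hypothetical, and the second figure in each row of the table is read here as the Delta column, which is an assumption about the flattened table layout.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Deal:
    kind: str
    value_k: int   # value in $000
    delta: int     # second figure in the Port 1 table, read here as Delta (assumption)


PORT_1 = [
    Deal("Swap", 3163, 1656),
    Deal("Cap", 4301, 1303),
    Deal("Swaption", 1362, 684),
]


def totals(deals: Iterable[Deal]) -> Tuple[int, int]:
    """Portfolio value and portfolio delta as simple sums over deals."""
    deals = list(deals)
    return sum(d.value_k for d in deals), sum(d.delta for d in deals)
```

Calling `totals(PORT_1)` gives the bottom line of the table, $8,826 (in $000) and 3,643. The same loop applies unchanged to portfolios with hundreds or thousands of deals.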


1 One-Off Analyses: The alternative to systematic data collection can be the syndrome
of frantically gathering information to answer a specific “one-off” management question.


2 Economic Capital: See the chapter on Economic Capital for a discussion of the
standalone and diversification risk issues.


3 Message to Systems: Please Store Both Prices AND Hedges in the Database, OK?
Historical series of both prices and hedges of deals in a portfolio are desirable in order to
answer certain risk questions. The historical input curves should also be saved or
archived. Early defective design decisions in the data model can have unforeseen
restrictive consequences later. Bad decisions are sometimes made.


Definitions of Scenarios


By a scenario is meant a potential or hypothetical change in underlying variables. 
Some major considerations for scenarios are: 


• The Scenario Date
• Variables in the Scenario
• Type of Scenario (Postulated, Historical, Statistical)


First, the scenario date needs to be specified. Often, this is just taken as 
today’s date for an “instantaneous” change in the underlying variables. However, 
more information can be obtained by looking at scenario dates in the future. 

The variables in the scenarios can be interest rates, volatilities, equity indices, 
specific stock prices, commodities, etc. The variables used depend on the 
situation and the resources available. 

Here is an example of a scenario called Scenario A (or Scen A for short),
expressed in terms of the forward 3-month Libor rate changes.


[Figure: Scenario A. Plotted series: d(FwdRate bp), versus Forward Time (Yrs) from 0 to 19.]


The scenario in the graph assumes that forward rates will drop 20 bp for a
certain time and eventually increase up to 100 bp in 20 years⁴. Therefore, this is a
yield-curve steepening scenario (the short-maturity end drops and the
long-maturity end rises). The way the scenario is generated is that the underlying curve
input parameters (from futures, treasuries, swap spreads) are changed by certain
amounts, the new curve is generated, and the difference is taken with respect to
the old curve. This particular scenario assumed no changes in volatility.

Various types of scenarios exist. A scenario can be postulated (e.g. interest
rates up 100 bp). A historical scenario uses information from historical moves
(e.g. spreads blowing out in the roiled markets in 1998). A theoretical statistical
scenario uses a model (e.g. a path from a discretized multifactor model with
random correlated fluctuations about forward curves).

Naturally, various opinions exist regarding the utility of these possibilities for
scenarios. Information overload is a definite consideration with many scenarios.
People just cannot absorb too much information. Still, myopic risk tunnel vision
can result from a limited sample of scenarios (or no scenario analysis at all).

Resource and budget constraints often wind up severely limiting the number
of scenarios run in practice.


Changes In Portfolios Under Scenario Assumptions


Here are the changes in the same portfolio (Port 1), assuming that the changes in
the forward rates occur, as given by Scenario A:


Change in Portfolio
Port 1    Scenario: ScenA
4/1/00    Scen. Date 4/1/00


Type Value $000 Delta Gamma

Swap -$518 -1
Cap -$388 50
Swaption -$222
-$1,128


We see that under the steepening assumption of this scenario, all deals in Port
1 lost money⁶. Note that the hedges changed too. This scenario was instantaneous
(the scenario date was the same as the date the analysis was done).


4 Long-Term Scenarios? Naturally, no one has any idea of what the world will look like
in 20 years. Nonetheless, a 20-year swap on 3-month Libor has forward rates going out to
20 years. Note that not making a scenario at 20 years effectively means you assume that
the Libor rate will be unchanged 20 years from now, which is itself a scenario.


5 Wild scenarios? At an offsite risk department meeting in the early 2000's I asked 
everybody to list “worst risk” scenarios. One scenario involved a terrorist attack on, as I 
remember it, facilities in Lower Manhattan. In the voting on the “top ten” risks, this 
scenario did not make the finals. 


6 My Favorite Job Interview Question: Although I don’t like to give pressure
interviews to job candidates by asking them technical riddles and watching them freeze, I
do usually ask one question. If the candidate doesn't know the answer, I just watch the
reasoning process. Here's the question: What happens to the value of a cap when the
curve of forward rates steepens? Naturally you know that the value of the cap goes up if
rates go up because the cap, being insurance against rising rates, then has more
probability to pay off. So why did the cap in the example Scenario A, a steepening
scenario, lose money?


Many Scenarios with a Fixed Portfolio


The same sort of analysis can be done for the given portfolio with a collection of
scenarios Scen A, Scen B ... Scen Z. The results for the individual changes in the
portfolio can simply be tabulated in a report like this:


Port 1

Port 1 Change
$dP_1(A)
$dP_1(B)
...
$dP_1(Z)


Scenarios with Weights


Aggregation across scenarios for a fixed portfolio can be done by assigning
probability weights to the scenarios. So Scenario A can have weight w(A), etc.,
with the weights summing to one. The weights might be fixed by hand using
judgment. Alternatively, if a mathematical model is used for generating the
scenarios, the model will specify the weights. Given the weights, we can also
sum over the changes to get the average change. A report might look like this:


Weight    Weighted Changes
w(A)      $dP_1(A)
w(B)      $dP_1(B)
...       ...
w(Z)      $dP_1(Z)

Sum       $dP_1(All)
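
The weighted aggregation across scenarios can be sketched as follows. This is a minimal illustration: the function name and the tolerance are choices made here, and apart from the Scenario A change of -$1,128 (in $000) taken from the Port 1 example, the changes and weights are hypothetical.

```python
from typing import Dict


def weighted_change(changes: Dict[str, float], weights: Dict[str, float],
                    tol: float = 1e-9) -> float:
    """Average portfolio change: the sum over scenarios of w(X) * dP(X).

    Raises if the scenario sets differ or the weights do not sum to one.
    """
    if set(changes) != set(weights):
        raise ValueError("changes and weights must cover the same scenarios")
    if abs(sum(weights.values()) - 1.0) > tol:
        raise ValueError("scenario weights must sum to one")
    return sum(weights[name] * changes[name] for name in changes)


# Scen A is the Port 1 change from the text; Scen B and Scen C are made up.
CHANGES_K = {"Scen A": -1128.0, "Scen B": 400.0, "Scen C": 150.0}   # $000
WEIGHTS = {"Scen A": 0.25, "Scen B": 0.25, "Scen C": 0.50}

average_change_k = weighted_change(CHANGES_K, WEIGHTS)   # -107.0, i.e. -$107K
```

Checking that the weights sum to one up front catches hand-entered weights that have drifted, which is the most likely failure when weights are set by judgment.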
Many Portfolios and Scenarios


We can consider a variety of portfolios and scenarios. Here is a possible set up 
that might occur in a system. We have put in a little more detail. 


Run File for Several Portfolios and Several Scenarios


Today's date     4/1/00
Scenario date    4/1/00
Filename         Instantaneous.doc

Portfolio   Want Port?   Book        Scenario   Want it?   Weight %
Port 1      Y            Book 1      Scen A     Y          25%
Port 2      N            Book 1      Scen B     Y          25%
Port 3      Y            Book 7      Scen C     Y          50%
...         ...          ...         Scen Z     N          0%
Port N      N            Book b(N)

We have N portfolios. Each portfolio is in a “book” for aggregation. We can
either include the portfolio or not in the run using the flag Y or N in the second
column. For each scenario, we specify whether we want to run that scenario, and put
in a weight. All portfolios specified are included for each specified scenario.
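
One way to picture such a run file in code is as a small set of records plus a rule that pairs every wanted portfolio with every wanted scenario. The class and function names below are hypothetical and only illustrate the layout; the rows are the ones visible in the instantaneous run file above, with 4/1/00 read as April 1, 2000.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class PortfolioRow:
    name: str
    want: bool      # "Want Port?" flag (Y/N)
    book: str       # book used for aggregation


@dataclass(frozen=True)
class ScenarioRow:
    name: str
    want: bool      # "Want it?" flag (Y/N)
    weight: float   # as a fraction, so 25% is 0.25


@dataclass(frozen=True)
class RunFile:
    today: date
    scenario_date: date
    filename: str
    portfolios: Tuple[PortfolioRow, ...]
    scenarios: Tuple[ScenarioRow, ...]


def planned_runs(run: RunFile) -> List[Tuple[str, str, float]]:
    """Every wanted portfolio is run under every wanted scenario."""
    return [(p.name, s.name, s.weight)
            for s in run.scenarios if s.want
            for p in run.portfolios if p.want]


INSTANTANEOUS = RunFile(
    today=date(2000, 4, 1),
    scenario_date=date(2000, 4, 1),
    filename="Instantaneous.doc",
    portfolios=(
        PortfolioRow("Port 1", True, "Book 1"),
        PortfolioRow("Port 2", False, "Book 1"),
        PortfolioRow("Port 3", True, "Book 7"),
    ),
    scenarios=(
        ScenarioRow("Scen A", True, 0.25),
        ScenarioRow("Scen B", True, 0.25),
        ScenarioRow("Scen C", True, 0.50),
        ScenarioRow("Scen Z", False, 0.0),
    ),
)
```

With these rows, `planned_runs(INSTANTANEOUS)` lists six runs: Port 1 and Port 3 under each of Scen A, Scen B, and Scen C.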


A Scenario Date in the Future 
The next set up is for a scenario in 10 days. Notice that the options allow 
changing the portfolios and/or scenarios as time progresses. We can also specify 
if we want to produce reports. 


Second Run File for Scenario Date in Ten Days


Today's date     4/1/00
Scenario date    4/10/00
Filename         Ten days.doc

Portfolio   Want Port?   Book        Scenario   Want it?   Weight %
Port 1      N            Book 1      Scen A     N          0%
Port 2      N            Book 1      Scen B     Y          33%
Port 3      Y            Book 7      Scen C     Y          33%
...         ...          ...         Scen Z     Y          33%
Port N      Y            Book b(N)

Instructions, Options
Slide down fwd curve?    Y
Produce reports?         Y

Sliding or Not Sliding Down the Forward Curve 


We can either “slide down the forward curve” for rates or not (see the box near 
the bottom). First, consider just moving the time forward but leaving the rates 
“fixed”. We actually have two possibilities for keeping rates “fixed” as time 
changes from “today” to the scenario date. Either the forward rate at a given date 
remains fixed, or the forward rate at a given time interval from the value date 
remains fixed. The latter possibility is called “sliding down the yield curve” 
because a rate at a given date after the time shift will be closer to the origin than 
the rate at the same given date before the time shift. 

We can apply the same idea for a scenario. Having specified the scenario 
forward curve, we can slide down the curve to define the final scenario, or not. 
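
The two conventions for keeping rates “fixed” can be made concrete with a short sketch. The function name, the representation of the curve as a function of time from the value date, and the sample curve are all assumptions made for illustration.

```python
from typing import Callable

# Forward rate as a function of the time interval (in years) from the value date.
Curve = Callable[[float], float]


def forward_for_date(curve_today: Curve, date_from_today: float,
                     shift: float, slide: bool) -> float:
    """Forward rate attached to a fixed calendar date after the value date
    moves forward by `shift` years.

    slide=False: the forward rate at a given date remains fixed.
    slide=True:  the forward rate at a given interval from the value date
                 remains fixed, so the date, now nearer the new value date,
                 picks up the rate closer to the origin of the curve.
    """
    if date_from_today < shift:
        raise ValueError("date lies before the scenario value date")
    if slide:
        return curve_today(date_from_today - shift)
    return curve_today(date_from_today)


def sample_curve(t: float) -> float:
    """Hypothetical upward-sloping forward curve: 5% plus 20 bp per year."""
    return 0.05 + 0.002 * t


no_slide = forward_for_date(sample_curve, 5.0, 1.0, slide=False)   # rate at t = 5
with_slide = forward_for_date(sample_curve, 5.0, 1.0, slide=True)  # rate at t = 4
```

For an upward-sloping curve the sliding convention gives the lower rate for the same calendar date, which is the sense in which the rate "slides down" the curve.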


A Scenario Simulator 


In the above analysis, we kept the date fixed for each scenario run. We can now 
construct a scenario simulator in which we successively change the dates. Here is 
a setup where we run the instantaneous, ten-day, and one-month scenarios. 


First Simulator Run File
Filename    Sim_1.doc


Date        Want it?
4/1/00      Y
4/10/00     Y
5/1/00      Y
6/1/00      N
7/1/00      N


Risk Analyses and Presentations 


Given the results, we perform various statistical analyses from the reports
generated by the system. We might look at averages, worst cases, confidence
levels, plot histograms, etc. We can compare the results obtained from this
analysis with what we got in the last analysis (e.g. last quarter). This is
useful for trend analysis and strategic planning. Finally, we prepare presentations
and communicate the results to the management⁷.


7 Presentations and Communication: For upper management, we need to summarize at
a high level. We probably want to use PowerPoint with large font and a color printer.
Details of the great analyses we did will probably just go into an appendix of a
white-paper handout for reference. Please go back and reread the Amusing but Dead Serious
Practical Exercise at the beginning of this book, and maybe even do the exercise!


PART III: EXOTICS, DEALS, AND CASE STUDIES


13. A Complex CVR Option (Tech. Index 5/10)


This chapter contains a case study of a complicated equity option called a CVR 
that was an important part of an M&A deal. Although the events happened long 
ago, some of the analysis is quite general and could be relevant in other contexts. 
The CVR will be considered in some depth in order to give an idea of the
complexity that sometimes occurs¹. Many of the topics in this chapter are quite
general. A variety of interesting theoretical points arose while pricing the CVR.
These included conditions under which an option will or will not be extended in time.

We use the present grammatical tense in order to dramatize the situation as it 
unfolded at the time. Letters ABC, DEF, XYZ are used generically to describe 
the players. Various specific topics are described in the footnotes. 


The M&A Scenario 


The ABC Corporation wants to acquire DEF Corporation. ABC is willing to pay 
the DEF stockholders cash along with ABC stock. From the point of view of 
DEF, the risk is that the ABC stock could decrease in value. This might happen 
for any of a number of reasons. ABC may have to issue extra debt, leaving it in a 
weakened condition. Moreover, ABC could be in competition with another firm 
XYZ to purchase DEF and thus might be forced to pay a high price. This in fact 
happens, with a fierce bidding war. To counter the XYZ bid, ABC after hesitation
decides to offer DEF stockholders a sort of put option on ABC stock with certain
side conditions². Thus if ABC stock does go down, the DEF stockholders are
compensated within certain limits, making DEF more willing to go through with
the ABC proposal³ ⁴.


1 Which Deal? This high-profile M&A deal was front-page news around 1994. In fact,
ABC was Viacom, DEF was Paramount Communications, and XYZ was QVC Network.
See Refs. 1.


2 Put Option Structures: These are generically called “price-protection collars”.


3 The CVR Option and the Deal: A complex extendable put option on the Viacom
stock, called a CVR, was offered as a last-minute inducement to Paramount stockholders.
It clinched the deal. CVR is short for “Contingent Value Right”. The conditions defining
the CVR were public record, and the CVR later traded.


CVR Starting Point: A Put Spread


From ABC’s point of view, a put option is a risky object with potentially quite 
expensive consequences. Therefore, side conditions are imposed on the put 
option restricting the payout. These side conditions boil down to requiring the 
DEF stockholders to give back some optionality to ABC. In the first 
complication, DEF gives back to ABC a less valuable put option, which has a 
strike price lower than the strike price of the put option given to DEF from ABC. 
Thus, including both put options (one long and one short), ABC gives DEF a put
spread. The put spread starts to pay off to DEF linearly if the ABC stock
descends below the upper strike. If the ABC stock descends below the lower
strike, the payoff is held constant. In summary, in this first approximation, ABC
gives DEF stockholders a put spread which exercises at time t_1.
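
The put spread payoff described here is simple enough to write down directly. The sketch below is illustrative; the function name is hypothetical, and the strikes of $48 and $36 used in the usage note are the first put spread parameters quoted in the Parameters footnote later in this chapter.

```python
def put_spread_payoff(stock: float, upper: float, lower: float) -> float:
    """Payoff to the holder of a put spread: long a put struck at `upper`,
    short a put struck at `lower`.

    Zero above the upper strike, rising linearly below it (the "ramp"),
    and held constant at (upper - lower) below the lower strike.
    """
    if lower >= upper:
        raise ValueError("lower strike must be below upper strike")
    return max(upper - stock, 0.0) - max(lower - stock, 0.0)
```

With strikes (48, 36) the payoff is 0 at a stock price of 50, 6 at 42, and capped at 12 for any price at or below 36.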


CVR Extension Options and Other Complications


Adding more complexity, ABC also reserves the right to extend the put spread
payoff to a second exercise date t_2 past the first exercise date t_1. Thus, even
though at t_1 the put spread could be in the money for DEF, ABC could refuse to
pay and simply extend the original put spread⁵. In the actual deal, the put spread
if extended past t_1 has modified strikes. In fact, the situation is even more
complicated because ABC reserves the right to extend the option payoff a second
time to a third date t_3.


CVR: Averaging Specifications 
In fact, the stock options are not written on the stock price itself but on certain
functions of the stock price. For t_1, a set of time windows (labeled by an index
k) is defined starting at t_k^Start = t_1 − k δt [δt = 1 day⁶, k = 1...K_max], with
length Δt_Window. The average of the stock price is then taken over each of these
windows, namely

\[
\$S_{\mathrm{Avg}}(k) = \mathop{\mathrm{Average}}\limits_{t \,\in\, \left(t_k^{\mathrm{Start}},\; t_k^{\mathrm{Start}} + \Delta t_{\mathrm{Window}}\right)} \left[\, \$S(t) \,\right].
\]

Then the median of these averages over all windows is found, viz.

\[
\$S^{\mathrm{Med}} = \mathop{\mathrm{Median}}\limits_{k = 1 \ldots K_{\max}} \left[\, \$S_{\mathrm{Avg}}(k) \,\right].
\]

This $S^Med is then compared with the strike prices in the put spread. The same
condition holds at the second strike date.


4 The VCR Option: Another option, present in a related deal between Viacom and
Blockbuster, was called a VCR, short for “Variable Common Right”. Unofficially the
name VCR was said to have been made up by the investment bankers because
Blockbuster’s business includes rental and sale of prerecorded videocassettes played on
VCRs. We will not look at the VCR in this book.


5 Conditions for Extension of the CVR: The conditions for extension were hotly
debated at the time, and will be considered below in some detail.


6 Window Days Definition: These days are business days. The number of windows K_max
was specified in the contract.

At the third exercise date, the maximum of the averages is compared with the 
strike prices instead of the median of the averages. This point, in the fine print, is 
of extra value to ABC since it inflates the effective ABC stock price and thus 
diminishes the payout to DEF. 
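
The averaging rule can be sketched as follows. This is an illustration only: the function names are hypothetical, the windows are passed in as explicit start indices into a list of daily closing prices (how the contract laid out the windows is not reproduced here), and the price series is made up.

```python
from statistics import mean, median
from typing import List, Sequence


def window_averages(prices: Sequence[float], starts: Sequence[int],
                    window_len: int) -> List[float]:
    """Average stock price over each window of `window_len` business days."""
    averages = []
    for start in starts:
        window = prices[start:start + window_len]
        if start < 0 or len(window) != window_len:
            raise ValueError("window does not fit inside the price series")
        averages.append(mean(window))
    return averages


def reference_price(prices: Sequence[float], starts: Sequence[int],
                    window_len: int, exercise_number: int) -> float:
    """Price compared with the strikes: the median of the window averages at
    the first and second exercise dates, the maximum at the third."""
    averages = window_averages(prices, starts, window_len)
    return max(averages) if exercise_number == 3 else median(averages)


# Hypothetical daily closes and four overlapping three-day windows.
PRICES = [40, 42, 44, 46, 48, 50]
STARTS = [0, 1, 2, 3]

median_ref = reference_price(PRICES, STARTS, 3, exercise_number=1)   # 45
maximum_ref = reference_price(PRICES, STARTS, 3, exercise_number=3)  # 48
```

Since the maximum of the averages can never be below their median, the third-date rule can only raise the reference price and so can only reduce the put spread payout, as noted in the text.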

The structure of the deal so far is summarized in the diagram below. The two
boxes marked “Put Spreads” and “Extension Options” comprise the CVR⁷.


[Diagram: Basic Structure of the M&A Deal. Boxes: “ABC Corp. (Acquiring)”, “Stockholders, DEF Corp.”, “DEF Corp.”, “Put Spreads”, “Extension options”. Flow label: “Cash, stock”.]

Important additional complexity was present, including a “Chameleon Bond”
and a warrant depending on it. These topics are in the footnotes below⁸ ⁹ ¹⁰.


7 Specification of the CVR: The CVR structure and parameters were specified by the
investment bankers, who were provided with pricing information under various
assumptions as described below.


8 The Viacom “Chameleon Bond” and its Behavior: To help fund the Paramount deal,
Viacom needed to issue merger debt. The debt was itself a contingent object, and was
dubbed the “chameleon bond”. This was because the bond could change its character
depending on future events, notably the possibility of acquisition by Viacom of
Blockbuster Entertainment Corp. The specifications included a feature making the debt
exchangeable into preferred stock at Viacom's option if the acquisition of Blockbuster was not
completed by a certain date. The reason was that Blockbuster had strong cash flow, and if
acquired Viacom would have the additional cash to pay the debt coupons. Otherwise,
Viacom needed to change the bond into stock. Various other bells and whistles were also
present. The logical flow diagram for the chameleon bond took up an entire page with
small print, and was difficult to memorize. Nominally the bond was 8% debt with
maturity 7/7/06. The bond was B1 credit, and so was a high-yield “junk” bond.


The Arbs and the Mispricing of the CVR Option


Significant arbitrageur activity for a potential M&A deal is often present, where
"arbs" bet on the deal going or not going through with appropriate positions in
the ABC stock¹¹. This can have an undesirable effect from the point of view of
ABC. If the arbs own the CVR, which is some form of put, they want the ABC
stock to go down near the exercise or payoff time. Therefore, near payoff, they
may start to sell ABC stock or go short. This puts downward pressure on the
stock, increasing the CVR payoff to the arbs. In this case, ABC tried to thwart the
arbs, partially by making the CVR option so complicated that the arbs would
have trouble understanding and pricing it. The result was that at first the CVR
was overpriced by being considered by the arbs as a single one-period put spread,
ignoring the negative effect on the CVR value through the complex extension
conditions.

After the deal took place, the CVR was listed (on the ASE) and traded in the
secondary market. At this point, the market wound up under-pricing the CVR
relative to models. Hence, at both the start and afterwards, the CVR option was
chronically “mispriced” with respect to “theoretical fair value”. For discussion,
see the footnotes¹² ¹³.


9 Junk Bonds and Stock: The behavior of the price moves of the chameleon bond for an
initial period after it was issued was highly correlated with Viacom's stock price moves. 
This is often the case with high-yield bonds, because the elevated credit risk for a high- 
yield bond is often regarded as making the bond risk similar to the risk for stocks in the 
eyes of investors. 


10 The Bond-Dependent Warrant was Another Complication: Warrants are call 
options that allow the warrant holder to purchase ABC stock when exercised. Warrants 
lead to some dilution, but they are less risky than CVR instruments because they are 
exercised when the stock is doing well, which lowers the dilution pain. The exercise can 
be performed, using cash or sometimes with other securities provided by the warrant 
holder, to ABC. Another complication in the Viacom-Paramount structure was a warrant 
that could be exercised either with cash or by using the chameleon bond under certain 
conditions. A precise valuation of this warrant including the convertibility dependence on 
the bond with its myriad uncertainties (including the probability of the Blockbuster 
merger) was not possible. In practice, this complex warrant was valued using the cash 
exercise assumption. 


11 Acknowledgements: I thank Doug Hiscano for informative discussions regarding arbs
and on many other topics. 


12 Important Philosophical Issue: To Be or Not To Be Mispriced: So if the market
price of a security disagrees with the theoretical valuation, who is right? An aggressive
point of view announces that the market in such a case is "wrong", or "out of
equilibrium", and given enough time the market will revert to the theoretical value. This
sort of analysis is called “rich-cheap analysis”. The time scales for the market to thus
wise up and revert are often a bit hazy. If there is considerable empirical evidence in a
given case that the market oscillates around the theoretical price, then there are good
grounds for such a view. However, unusual non-standard securities may not have any
equilibrium behavior, or more to the point there may be many equilibria.


A Simplified CVR: Two Put Spreads with Extension Logic


Many of the basic issues arising for the CVR can be explained using two put
spreads expiring at different times t_1 and t_2 with no averaging side conditions,
since everything can be done explicitly. These two put spreads will be taken with
the same parameters as occurred in the CVR¹⁴. We need back-chaining logic.


Back-Chaining Logic with Two Times


The back-chaining logic is needed to discover the "critical decision indifference
points" in the stock price at each exercise date, determining academically if ABC
pays out or extends. The procedure for the simplified example takes place at t_1.
At this time, ABC has to make a decision on what to do. ABC first calculates the
intrinsic value of the cash flow that it must pay at t_1 due to the first put spread, if
ABC does not extend. ABC also calculates the value of the second put spread evaluated at t_1.
This is the discounted expected value of future cash payments, which ABC may
have to make at time t_2. The academic critical value or indifference point $S_c
at t_1 occurs when these two numbers are equal. At this point, ABC should be
indifferent whether to pay out or extend. We solve for $S_c using this formula:

\[
\$E_{1\,\mathrm{Upper}} - \$S_c(t_1) \;=\; \$C_{\mathrm{Put\,Spread\,2}}\left[\, \$S_c(t_1),\, t_1;\, t_2 \,\right] \qquad (13.1)
\]

provided $S_c(t_1) ∈ ($E_1Lower, $E_1Upper) is on the put spread ramp.


13 Liquidity and Models: With regard to the previous footnote, one can always say that
the market is, in its infinite collective wisdom, making a “liquidity adjustment”, or
“low-demand adjustment”, or “perceived credit adjustment”, or whatever, to the theoretical
price. These cannot be calculated (else they would be taken into account in the theoretical
price to begin with). Only direct experience with the market allows evaluation.


14 Parameters: The first two put spreads in the CVR had upper and lower strikes of ($48,
$36) and ($51, $37). These are the put spreads in the simplified example. The third put
spread in the CVR was ($55, $38). In the CVR, these numbers were compared against the
functions of the stock price (median of averages, maximum of averages) discussed in the
section on averaging.


Non-Uniqueness of the Indifference Point for Two Put Spreads


Normally, there is one price region of extension and another price region where 
extension does not occur. A further complication occurs in this case, because it 
turns out that the indifference point is not uniquely determined mathematically 
from Eq. (13.1). This is because a put spread's intrinsic value at exercise can 
intersect another unexercised put spread (that has time value), at more than one 
point of the underlying stock. 

The graph below shows an example. The intrinsic value of the first put
spread with $E_1Upper = $48 and $E_1Lower = $36 is plotted against the second put
spread with one year to exercise, and with $E_2Upper = $51 and $E_2Lower = $37.
These are the relevant parameters for the CVR. The example shows the existence
of two indifference points near $28 and $39. Hence, the "rational market" logic is
more complex than usual.

The same phenomenon occurs with the full CVR extension logic, although
because of the complexity it is harder to visualize. The next graph shows the
non-uniqueness of the indifference point in the example with two put spreads:


[Figure: Note Two Crossing Points Near $28, $39. Curves plotted: 51-37 put spread; intrinsic 48-36 put spread.]
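
The two crossings can be located numerically. The sketch below is an illustration under loudly stated assumptions: the text does not give the volatility, interest rate, or dividend inputs behind its graph, so a flat 30% lognormal volatility, a 5% continuously compounded rate, and no dividends are assumed here, and the second put spread is valued with the standard Black-Scholes put formula. The function names are hypothetical.

```python
import math
from typing import List


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(spot: float, strike: float, vol: float, rate: float, t: float) -> float:
    """Black-Scholes European put, no dividends (an assumed model)."""
    sd = vol * math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / sd
    d2 = d1 - sd
    return strike * math.exp(-rate * t) * norm_cdf(-d2) - spot * norm_cdf(-d1)


def put_spread_value(spot, upper, lower, vol, rate, t):
    """Value of a put spread with time t left to exercise."""
    return bs_put(spot, upper, vol, rate, t) - bs_put(spot, lower, vol, rate, t)


def put_spread_intrinsic(spot, upper, lower):
    """Payout of a put spread if exercised now."""
    return max(upper - spot, 0.0) - max(lower - spot, 0.0)


def gap(spot: float, vol: float, rate: float, t: float = 1.0) -> float:
    """Pay-now amount on the 48-36 spread minus the value of the 51-37 spread."""
    return (put_spread_intrinsic(spot, 48.0, 36.0)
            - put_spread_value(spot, 51.0, 37.0, vol, rate, t))


def indifference_points(vol: float, rate: float, t: float = 1.0,
                        lo: float = 5.0, hi: float = 80.0, steps: int = 300) -> List[float]:
    """Scan a grid for sign changes of `gap`, then refine each by bisection."""
    roots = []
    step = (hi - lo) / steps
    for i in range(steps):
        a, b = lo + i * step, lo + (i + 1) * step
        fa, fb = gap(a, vol, rate, t), gap(b, vol, rate, t)
        if fa * fb >= 0.0:
            continue  # no sign change inside this grid cell
        for _ in range(60):
            mid = 0.5 * (a + b)
            fm = gap(mid, vol, rate, t)
            if fa * fm <= 0.0:
                b = mid
            else:
                a, fa = mid, fm
        roots.append(0.5 * (a + b))
    return roots


points = indifference_points(vol=0.30, rate=0.05)
```

Under these assumed inputs the scan finds two crossings, one below the $36 lower strike and one on the ramp between $36 and $48. Different volatility or rate assumptions move both points, so the exact values should not be read as a reproduction of the graph.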
Non-Academic Corporate Decision for Option Extension


An additional complication in pricing the option in a realistic fashion involves a 
non-academic argument that may be important in practice¹⁵. The usual rational
market academic dictum is not to extend if probabilistically more would have to 
be paid later. In that case, the standard academic logic would insist on a payout 
now, with no extension. However, ABC may want or need to ignore this 
academic criterion in favor of a more practical consideration. 

Specifically, even if it would theoretically "save money" not to extend, ABC
may find it preferable at a corporate level not to pay out substantial sums¹⁶ in a
weakened condition. For example, ABC may not have enough cash to pay out
and may not be able to borrow the money easily, etc.¹⁷ So ABC may take a
"corporate decision" to extend.

Such a corporate decision is not a default of ABC, because no rational market
logic is written into the contract with DEF; i.e. it is entirely at ABC's discretion
to extend. The corporate decision is also not a scenario "what if" analysis¹⁸,
because all stock paths are included in the analysis. Stock paths are not
“cherry-picked”, i.e. we do not look only at low stock-price paths but rather at all paths.
Instead, the rational-market decision process itself is modified.

Normally such corporate decision considerations are not included in option
pricing¹⁹. However, the true worth of the option to the holder and to the issuer
does depend on these considerations. The usual academic rational-market procedure


15 Disclaimer: All material in this section relative to possible corporate behavior is
merely surmised in the particular case involving Viacom, for illustrative purposes. 


16 Payoff Amounts and Corporate Decision to Extend: For Viacom, there were around
57MM CVRs that would need the maximum payoff of $12 at the end of the first year if
the stock dropped enough. Without a corporate decision to extend, the total payout
would then be over $680MM.


17 Loans to ABC to Pay off the Price Protection? It may be problematic for a bank to 
extend credit to ABC in this sort of situation. First, ABC may already be weakened by 
purchasing DEF, and second ABC may need to pay out a large amount on a put option 
just when ABC's stock is dropping. A loan could be problematic for ABC also; the 
interest rate charged under such circumstances could be high, and the additional leverage 
might wind up putting some pressure to lower ABC's credit rating. Substitution of stock 
would produce undesirable dilution and equivalence to issuing stock at a low price. 


18 Scenario What-If Analysis: A scenario what-if analysis corresponds to choosing a
particular stock price path or a collection of paths. These analyses are extremely valuable, 
since economic considerations can be incorporated. However, for model option pricing 
we include all paths, with path densities in the stock price at a given time given by the 
model. Sometimes people may think of the consequences of some scenario and then 
generalize inappropriately to the option price. 


19 Options, Cushions, and Corporate Decisions to Extend or Exercise: Sometimes
extra parameters are built into option pricing to include corporate decisions to exercise or 
extend, including cases involving embedded options in callable corporate and convertible 
bonds. These parameters are sometimes called “cushions”. 
determines the theoretical price. However if the rational-market procedure is not
followed in reality, then an additional component of risk or value exists that
cannot be determined by option theory. Without detailed knowledge of the
individual corporate situation it is naturally problematic to assess the
parameters a priori. First, there is the probability P_{CorpDecExt} that the
corporate decision will be made. If so, there is the stock price $S_{CorpDecExt}
at which the extension takes place. We do not try to estimate P_{CorpDecExt}, and
simply price the CVR both ways, with and without the corporate decision to
extend. If ABC does decide to extend we need $S_{CorpDecExt}, and we need to
assess the effect on the option price.

It would be preferable to adopt a method that does not require detailed corporate
information. It turns out that there is a feature in the case of extendable put
options that provides a natural choice for the quantity $S_{CorpDecExt}.


Compromise by Using the Two Indifference Points - Illustration 
If, as in the above figure, two "indifference" points occur, a compromise
solution can be found obeying both the rational market procedure and the
corporate finance extension considerations in different regions. The figure shows
that we can use the upper indifference point as corresponding to the rational
market procedure, and choose the lower indifference point as corresponding to
the low stock price $S_{CorpDecExt} at which the corporate decision to extend
might be made. The possible corporate decision to extend is made when the
intrinsic 48-36 put spread is at its maximum value, $12. Possible extension when
payout is maximal persists to the full CVR.


The CVR Option Pricing 


We can now discuss the full complexities of the CVR pricing. The side 
conditions render the analysis complicated. Different paths in the stock price lead 
to completely different states and the result at times past an extension depends on
what happened at previous possible exercise times. Twelve classes of paths exist. 


First Year Paths 
The following picture illustrates the first-year paths: 


[Figure: First Year Paths for CVR]
In the first year for the CVR, there are four classes of paths, starting at the
spot stock price at the issue date t_{Issue} up to the first exercise time t_1,
as shown above. We need to calculate the median of averages, as described above,
during the period marked Med.Avg before t_1 on the diagram.


The first-year paths are: 


• Class 1 paths are such that the median of averages winds up above the upper
strike price $E_{1Upper} for the first-year put spread. In this case, the CVR is
worth nothing.

• Class 2 paths have the median of averages below $E_{1Upper} but above the
point $S_{Ind-1U}. This is the "upper indifference point" that we get by rational
market back-chaining logic. In this region, ABC pays DEF the value of the
first put spread, $E_{1Upper} - $S(t_1).

• Class 3 paths extend past t_1 and follow the rational market logic. For these
paths, ABC calculates a smaller economic penalty on the average for
extending the option than for paying out at t_1, so extension occurs.

• Class 4 paths arrive below the "lower indifference point" $S_{Ind-1L} at t_1.
Normally there is one indifference point. However, for two put spreads there
are two indifference points, as we saw above. There are two possibilities.

According to the rational market logic, paths below $S_{Ind-1L} should stop,
because ABC would calculate a greater economic penalty on the average for
extending the option than for paying out at time t_1. If this happens, ABC pays
the maximum first put spread intrinsic value, $E_{1Upper} - $E_{1Lower}.

However, just as in the simple example of two put spreads discussed above, a
corporate decision to extend may occur. Again, this is not a feature of academic
options pricing; it arises because ABC may be in a weakened condition.
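
The first-year decision logic for the four classes of paths can be sketched in a few lines of code. This is only an illustration, not the pricing implementation: the names `first_year_outcome` and `corporate_extension` are hypothetical, the strikes are those of the first put spread, and the indifference points are assumed to be the 6/7/94 values quoted later in this chapter.

```python
from dataclasses import dataclass

E1_UPPER = 48.0   # first-year upper strike
E1_LOWER = 36.0   # first-year lower strike
S_IND_1U = 43.25  # upper indifference point at t_1 (6/7/94 value, assumed here)
S_IND_1L = 24.55  # lower indifference point at t_1 (6/7/94 value, assumed here)


@dataclass
class Outcome:
    path_class: int   # 1..4, as in the list of first-year paths
    extends: bool     # True if the CVR continues past t_1
    payout: float     # cash paid by ABC at t_1


def first_year_outcome(med_avg: float, corporate_extension: bool = False) -> Outcome:
    """Classify a first-year path by its median of averages at t_1."""
    if med_avg >= E1_UPPER:
        return Outcome(1, False, 0.0)                  # out of the money
    if med_avg >= S_IND_1U:
        return Outcome(2, False, E1_UPPER - med_avg)   # pay the put spread
    if med_avg >= S_IND_1L:
        return Outcome(3, True, 0.0)                   # rational extension
    if corporate_extension:
        return Outcome(4, True, 0.0)                   # non-academic extension
    return Outcome(4, False, E1_UPPER - E1_LOWER)      # maximum payout
```

For example, a median average of $46.56 gives a class-2 payout of $1.44, while a median average of $20 gives either the maximum $12 payout or an extension, depending on the hypothetical `corporate_extension` flag.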


Second Year Paths 
The paths for the second year for the CVR are in the diagram below: 
[Figure: Second Year Paths for CVR]


The second-year paths are: 


• Class-5 paths finish out of the money and the CVR terminates at no cost.

• Class-6 paths are terminated at t_2 with a payout equal to the intrinsic value
of the second-year put spread down to the point $S_{Ind-2U}, where the payout
stops, similar to the class-2 paths at t_1.

• Class-7 paths extend past t_2, through the standard back-chaining logic.

• Class-8 paths would stop at t_2 by the rational market logic and pay the
maximum amount $E_{2Upper} - $E_{2Lower}. However, these paths may extend
past t_2 by a corporate decision to extend, as explained above.

• Class-9 paths are a continuation of class-4 paths, which resulted from a
previous corporate decision to extend at time t_1.
Third Year Paths
The paths for the third year for the CVR are in the diagram below.


[Figure: Third Year Paths for CVR]


The third-year paths are: 


• Class-10 paths are out of the money at t_3 and are terminated without cost.

• Class-11 paths are paid out at t_3 as a put spread. As explained below, the
maximum condition may be approximately taken into account by lowering
the put strikes $E_{3Upper}, $E_{3Lower} for purposes of calculation to
$E'_{Up}, $E'_{Low}.

• Class-12 paths are the continuation of the corporate-extension paths, which
end at t_3 with payouts as above.
Analytic CVR Pricing Methodology


An analytic method can be employed in practice. This method involves bivariate
and trivariate integrals, with back-chaining logic. The bivariate integral occurs
because the paths that terminate at t_2 cannot range over all values at t_1. The
trivariate integral occurs because the paths that terminate at t_3 cannot range
over all values at t_1 and t_2. The method is not exact, mainly because of the
complexities of the averaging. However, corrections can be made to the volatility
to account for the median of averages and maximum of averages in a reasonable
fashion, as discussed below.


Payout Details for the CVR 
The payouts are as follows: 


• For t_1, the payoff from the class-2 paths is a standard put-spread intrinsic
value with upper strike $E_{1Upper} and lower strike $S_{Ind-1U}. However, the
put spread is cut off to zero at price $S_{Ind-1U}. This cutoff can be modeled
as a short position in a digital option that pays -($E_{1Upper} - $S_{Ind-1U})
for $S(t_1) < $S_{Ind-1U}, or actually the median of the averages²⁰.

• For t_2, the payoff from class-6 paths again looks like that arising from a put
spread minus a digital option. However there is an important difference,
because the paths that extend past t_1 are filtered. In some sense, we must
multiply the payoff by the probability that the paths get past t_1. This
translates into the bivariate integral²¹.


20 Digital Option: A digital option just pays off a constant value. Hence, this digital
option is just an appropriate integral over the Green function for the first year, with
appropriate discounting. We discuss digital options in Ch. 15.


21 Bivariate Integral: Technically, in the path integral framework the successive use of
the convolution semi-group property over all intermediate values converts short-time
Gaussian propagators into long-time Gaussian propagators. However, this breaks down
when intermediate values are restricted, as they are at t_1, so one additional integral
remains. See the chapters on path integrals, Ch. 41-45.
• For t_3 the buck stops, and payouts need to be made. The filters at the
preceding times leave two extra integrals, and a trivariate integral appears²².
This time the maximum of averages must be taken into account.
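
The statement that the class-2 payoff is a put spread minus a digital can be checked directly at the level of intrinsic values. The sketch below is only illustrative: the function names are hypothetical, and the upper indifference point is passed in as a parameter rather than computed.

```python
def put_intrinsic(strike: float, s: float) -> float:
    """Intrinsic value of a put at stock price (or median average) s."""
    return max(strike - s, 0.0)


def class2_payoff_direct(s: float, e_upper: float, s_ind_u: float) -> float:
    """Payoff written out by region: E_upper - s between S_ind_u and E_upper."""
    return e_upper - s if s_ind_u <= s < e_upper else 0.0


def class2_payoff_decomposed(s: float, e_upper: float, s_ind_u: float) -> float:
    """Same payoff as a put spread minus a digital struck at S_ind_u."""
    spread = put_intrinsic(e_upper, s) - put_intrinsic(s_ind_u, s)
    digital = (e_upper - s_ind_u) if s < s_ind_u else 0.0
    return spread - digital
```

Below the indifference point the put spread is worth the constant $E_{1Upper} - $S_{Ind-1U}, which the short digital removes exactly, so the two forms agree at every stock price.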


The summary payout diagram for the whole CVR below should clarify the ideas: 


[Figure: Summary of CVR Payout Logic. Labels: Extension; Bivariate; Trivariate;
"No payout if corporate decision to extend, otherwise max. payout".]


Possible Irrelevance of the First Two Lower Put-Spread Strikes 
If ABC decides to extend the option at t_1 and t_2 below the lower indifference
points $S_{Ind-1L}, $S_{Ind-2L}, then the lower strikes $E_{1Lower}, $E_{2Lower}
do not enter the payoff conditions. Hence, in that case these variables become
irrelevant for the


22 Trivariate Integral: The same remarks as made for the bivariate integral apply, except 
that two extra integrals remain due to some paths terminating at the two previous times. 
value of the CVR. Therefore lowering the values of the lower strikes would not have
increased Viacom's risk under the corporate extension decision assumption.


Back-chaining Logic for the CVR 


The back-chaining logic for the CVR is a straightforward extension of the logic 
presented in the example for two put spreads. Again, back chaining is needed to 
discover the academic decision indifference points in the stock price at each 
exercise date, thus determining academically if ABC pays out or extends. It is 
easier to implement this logic with the analytic approach rather than the Monte- 
Carlo approach. 


The procedure begins at the next-to-last exercise date t_2. ABC compares the
intrinsic value of the cash flow that it must pay at t_2 (if ABC does not extend)
with the discounted expected value of all future cash payments. At t_2, the future
cash payments are those at time t_3. The academic critical or indifference value of
the stock price $S_{Ind-2}(t_2) at t_2 occurs when these two numbers are equal.


A similar calculation is then performed at the first exercise date t_1. The
expected discounted value of future cash flows at t_2, t_3 must be evaluated at time
t_1. This includes a bivariate integral to evaluate the expected discounted t_3 cash
flows at time t_1. The calculation performed at t_2 clearly affects the future cash
flows at t_3. In the back-chaining procedure, the results provide the standard
academic criteria telling ABC when to pay out and when to extend.
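
For the simple two-put-spread example, the indifference condition (pay the intrinsic value now versus the discounted expected value of extending) can be sketched numerically. The following is a toy illustration under loud assumptions: plain Black-Scholes with no dividends and no averaging, and hypothetical parameters r = 5%, σ = 25%, one year to the second exercise. The crossing points it finds therefore need not reproduce the values near $28 and $39 quoted in the text.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(s: float, k: float, r: float, sigma: float, t: float) -> float:
    """Black-Scholes European put, no dividends."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s * norm_cdf(-d1)


def extension_minus_payout(s: float, r: float = 0.05, sigma: float = 0.25,
                           t: float = 1.0) -> float:
    """Cost of extending (51-37 put spread value) minus cost of paying now
    (48-36 intrinsic). Negative values mean extension is cheaper for ABC."""
    extend = bs_put(s, 51.0, r, sigma, t) - bs_put(s, 37.0, r, sigma, t)
    pay_now = max(48.0 - s, 0.0) - max(36.0 - s, 0.0)
    return extend - pay_now


def bisect(f, lo: float, hi: float, tol: float = 1e-8) -> float:
    """Root of f in [lo, hi], assuming a single sign change there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def indifference_points(f, grid_lo: float = 10.0, grid_hi: float = 60.0,
                        step: float = 0.5) -> list:
    """Scan a stock-price grid for sign changes of f, refine each by bisection."""
    points = []
    n_steps = int(round((grid_hi - grid_lo) / step))
    for i in range(n_steps):
        a = grid_lo + i * step
        b = a + step
        if (f(a) > 0.0) != (f(b) > 0.0):
            points.append(bisect(f, a, b))
    return points


points = indifference_points(extension_minus_payout)
```

Under these assumptions the scan finds two crossing points, one below the lower strike of $36 and one between the two strikes, which is the non-uniqueness discussed above. Between the two points extension is the cheaper choice for ABC.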


The Averaging Conditions 

For each of the first two years, the median of averages over a window is
specified. Correspondingly, the volatility used in the analytic method can be
reduced according to a standard approximation²³. To be specific, averaging over
the time interval T_{AvgPeriod} roughly reduces the square of the volatility σ^2
over that interval to 1/3 of its value. Because variances add in the analytic
model, we get the average volatility σ_{avg} given by


σ_{avg}^2 T_{Total} = σ^2 T_{NonAvgPeriod} + σ^2 T_{AvgPeriod} / 3    (13.2)


23 Volatility Modifications for Average Options: See derivation of volatility for average
options using path-integral techniques (see Ch. 41-45). The factor 1/3 occurs in the limit 
of large numbers of observations in the averaging period. 
For the median of averages, the approximation of substituting the average of
averages can be used, and the averaging method is then applied a second time to
the averaged volatility σ_{avg}. With the window interval Δt_{window}, we get the
median of averaged volatility σ_{MedAvg} as²⁴

σ_{MedAvg}^2 T_{Total} = σ_{avg}^2 (T_{Total} - Δt_{window}) + σ_{avg}^2 Δt_{window} / 3    (13.3)


For the third year specifying the maximum of averages, a 4-step binomial
model was used in the window before t_3 to get approximately the maximum of
averages, and an effective lowering of the third-year put-spread strikes was
calibrated to agree with this model. Thus, an equivalent option can be defined by
lowering $E_{3Upper}, $E_{3Lower} to $E'_{Up}, $E'_{Low} in order to take
account of the maximum condition.
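
The two-stage volatility reduction of Eqs. (13.2) and (13.3) for the first two years is easy to check numerically. The sketch below uses hypothetical function names and takes the contract parameters quoted in the footnote (averaging period 20 days, window 60 days, 250 days per year).

```python
import math


def averaged_vol(sigma: float, t_total: float, t_avg: float) -> float:
    """Eq. (13.2): variance over the averaging period is scaled by 1/3."""
    variance = sigma ** 2 * ((t_total - t_avg) + t_avg / 3.0) / t_total
    return math.sqrt(variance)


def median_averaged_vol(sigma: float, t_total: float, t_avg: float,
                        dt_window: float) -> float:
    """Eq. (13.3): the same averaging rule applied a second time over the window."""
    sigma_avg = averaged_vol(sigma, t_total, t_avg)
    return averaged_vol(sigma_avg, t_total, dt_window)


reduction_factor = median_averaged_vol(1.0, 250.0, 20.0, 60.0)
reduced_vol = median_averaged_vol(0.28, 250.0, 20.0, 60.0)
```

The variance factors are 1 - (2/3)(20/250) and 1 - (2/3)(60/250), whose product has square root about 0.892, so an input volatility of 28% becomes about 25%.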


Some Practical Aspects of CVR Pricing and Hedging 


In this section, we give some illustrative indications for the pricing and hedging 
methodology for the CVR²⁵.


Standard Rational-Market Logic


We first assume the canonical standard rational-market logic—i.e. no corporate 
decision to extend. On 6/7/94 (one day after being listed), the market value for 
the CVR was $7.625. The theoretical price of $8.96 and the market value $7.625 
differed by about 15%, the market CVR price being “cheap” with respect to the 
model. This “cheapness” continued to be true as the CVR was traded in the 
market²⁶. Evidently, the market discounted the value of the CVR below the
model for some reason". 


24 Numerical Volatility Modifications: The Viacom-Paramount CVR contract specified
T_{AvgPeriod} = 20 days and Δt_{window} = 60 days (business or trading days). We took
T_{Total} = 250 days for one year, a common assumption (sometimes 252 or 260 days is
used). The reduction in the input volatility σ was thus a factor 0.892, so σ = 28%
became 25% in years 1 and 2.


25 Indicative Prices: The theoretical numbers quoted in this section for 6/7/94 are
intended only as an illustrative pedagogical example. Juan Castresana numerically 
implemented the full CVR analytic methodology. 


26 Arbs Again: It seems that at this time the arbs were no longer regarding the CVRs as
1-year put spreads, knew that the theoretical CVR price including the extension options
was above the market price, and regarded the CVRs as cheap puts.
CVR Components


The CVR component parts that need to be priced correspond to the diagram of 
the payout logic above. Explicitly, these are: 


• First year put spread minus digital
• First year low indifference point digital
• Second year filtered put spread minus digital (bivariate)
• Second year filtered low indifference point digital (bivariate)
• Third year double-filtered put spread (trivariate)


A simple one-year put spread was worth $10.17, so the (negative) extension 
value ($8.96 — $10.17) on 6/7/94 was worth —$1.21. 


Indifference Points 


It is important to calculate the values of the indifference points that we discussed 
above. These are: 


• Upper indifference point at t_1: $S_{Ind-1U}
• Lower indifference point at t_1: $S_{Ind-1L}
• Upper indifference point at t_2: $S_{Ind-2U}
• Lower indifference point at t_2: $S_{Ind-2L}


On 6/7/94 these were $S_{Ind-1U} = $43.25, $S_{Ind-1L} = $24.55, $S_{Ind-2U} =
$44.98, $S_{Ind-2L} = $28.18. Note that the lower indifference points were below
the lower put-spread strikes. Therefore, e.g., a stock price at t_1 below
$S_{Ind-1L} would correspond to the maximum put-spread payoff
$E_{1Upper} - $E_{1Lower}. If a corporate decision to extend is made at t_1, the
condition $S_{CorpDecExt} = $S_{Ind-1L} corresponds to avoiding this maximum
payoff. This is similar to the two put-spreads example discussed above. We turn
next to this possibility.


Non-Standard Corporate Decision to Extend 


If, hypothetically, the corporate decision to extend is made, the value of the CVR 
corresponding to the above parameters increased by $0.44 to $9.40. Recall that 
the expected cash flow is increased because ABC decides not to pay, even if it 
probabilistically could cost more later. The digital options below the lower 
indifference points are eliminated for the first two years in this case. 

The CVR component parts in this case are: 
• First year put spread minus digital
• Second year filtered put spread minus digital (bivariate)
• Third year double-filtered put spread (trivariate)


Evidently, since the market price was lower than theoretical, either the 
market did not include this non-standard corporate decision to extend or else the 
discounting mechanism by the market from the model price already noted for the 
standard rational-market logic was increased. 


Maximum Condition and Averaging Contributions. 


We can look at the results with and without the third-year maximum condition. 
Recall that imposition of the maximum condition increases the effective ABC 
stock price at the third year, and so lowers the value of the price protection. The 
maximum condition lowered the CVR value by about $0.28.

Given all the press about the complicated averaging conditions, it is of 
interest to value the CVR with and without averaging. The averaging contribution 
on this day was worth about $0.24. 


CVR Valuation with Other Interest Rates 


The interest rate r mainly affects the option through the forward stock price. In
the absence of dividends, the forward price at time interval T from now is


$S_{fwd}(T) = $S_0 exp(rT)    (13.4)


The results quoted above were for the 3-year rate at the appropriate ABC 
credit. Although academically we are supposed to use the “risk-free rate", using 
Libor in fact gave a CVR model result on this day of around $9.70. This was 
farther from the market. In order to drive down the theoretical price to the market 
price, a rate over 13% had to be used. We might regard this high 13% rate as a 
measure of the inherent credit risk placed by the market on the CVR. On the 
other hand, we might regard it as the penalty for reduced demand". 


CVR Hedging and Scenario Results 


In order to trade, we need to know how to hedge, and so we need the CVR 
sensitivities. We also want to run scenario analyses, revaluing the CVR at 
different stock prices and vols. While it might seem redundant to do both, for a 
highly non-linear object like the CVR the usual sensitivities are not so exact. The 
hedges used were the usual quantities²⁷:


27 Units and Risk Reports: To emphasize it again, because risk quantities can be and are 
defined in different ways, it is always a good idea to get people to specify the definitions 
• Δ = Delta (per $1 stock change)
• γ = Gamma (per $1 stock change)
• Vega (per % volatility change)


The hedging sensitivities calculated for 6/7/94 were Δ = -0.33,
γ = -0.02/$, vega = -$0.08²⁸.


Physical Intuition 


It is useful to think of stock price moves mentally to get intuition and a physical 
feel for the risk. At $28 with a one-standard deviation move of around σ $S = $7,
and including the forward stock price change from spot of $1.30, the stock would
likely be in the range $22 to $36 at the end of one year. This range covers the
lower indifference point $S_{Ind-1L} to the lower strike $E_{1Lower}.
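
The back-of-the-envelope range can be written out explicitly. In this sketch the volatility of 25% is an assumption (the averaged volatility quoted in the footnote on numerical volatility modifications), and the variable names are hypothetical.

```python
spot = 28.0            # stock price used in the text
sigma = 0.25           # assumed one-year volatility (averaged value)
forward_shift = 1.30   # forward stock price change from spot, from the text

one_sigma_move = sigma * spot                      # about $7
range_low = spot + forward_shift - one_sigma_move  # about $22
range_high = spot + forward_shift + one_sigma_move # about $36

s_ind_1l = 24.55       # lower indifference point at t_1 (6/7/94)
e_1lower = 36.0        # first-year lower strike
covers = range_low <= s_ind_1l and e_1lower <= range_high
```

With these inputs the one-standard-deviation range is roughly $22.3 to $36.3, which does bracket the interval from the lower indifference point to the lower strike.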


Indicative Scenario Results, Comparison with the "Math Calculation"


Here are some scenario results. In practice, tables of scenario results may be 
provided to the trading desk. For fixed vols, interest rates, and time periods and 
stock price changes of $4 and $8, a “reval” calculation produces: 


• If $S = $32, the CVR price = $7.50, a change $δC_{Reval} = -$1.46
• If $S = $36, the CVR price = $5.93, a change $δC_{Reval} = -$3.03


We can compare the "reval" results with the usual "math calculation" result

$δC_{MathCalc} = Δ δ$S + (1/2) γ (δ$S)^2 + vega δσ    (13.5)


We get -$1.48 and -$3.28 for $δC_{MathCalc} in the above two cases. For small
moves, we naturally expect $δC_{MathCalc} ≈ $δC_{Reval}. The difference between
the Math Calc and Reval is $0.02 at δ$S = $4, and $0.25 at δ$S = $8.
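
The comparison can be reproduced from the quoted sensitivities. The sketch below evaluates the second-order expansion of Eq. (13.5) at fixed volatility; the function name `math_calc_change` is hypothetical, and the reval changes are simply the numbers quoted above.

```python
DELTA = -0.33   # per $1 stock change (6/7/94)
GAMMA = -0.02   # per $1 stock change, per $
VEGA = -0.08    # $ per 1% volatility change


def math_calc_change(d_s: float, d_sigma_pct: float = 0.0) -> float:
    """Second-order Taylor estimate of the CVR price change, Eq. (13.5)."""
    return DELTA * d_s + 0.5 * GAMMA * d_s ** 2 + VEGA * d_sigma_pct


reval_change = {4.0: -1.46, 8.0: -3.03}  # reval results quoted in the text
math_change = {d_s: math_calc_change(d_s) for d_s in reval_change}
difference = {d_s: abs(math_change[d_s] - reval_change[d_s]) for d_s in reval_change}
```

The expansion gives -$1.48 and -$3.28, differing from the reval by $0.02 and $0.25. The growing gap at the larger move is the reason for running scenario revaluations in addition to the sensitivities.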


We can also look at the change in delta Δ. The conclusions are similar, with
δΔ_{MathCalc} = γ δ$S.


including units. It is an even better idea to ask the programmers in the appropriate 
department to program the computer to print out this information in unambiguous, 
complete, recognizable English, free from local acronyms, directly on the risk reports. 


28 Vega for CVR: The sign of vega depended on the stock price. At other stock prices,
vega was positive. 
The CVR Buyback


It is clear that ABC would like to terminate the price protection. We have spent a 
lot of time in this chapter discussing under which conditions ABC would or 
would not be likely to extend, mainly concentrating on the dangerous (for ABC) 
low stock price paths. In fact, it turned out that the ABC stock price rose after the 
acquisition, and the ABC management then bought back the CVR option from 
the stockholders at a nominal fee²⁹. The corporate risk management using the
CVR option was successful from the point of view of ABC, and DEF 
stockholders enjoyed the partial price protection required to make the deal go 
through at the start. In sum, the CVR was a success. 


A Legal Event Related to the CVR 


Legal risk is an increasingly important topic. Quants might become involved in 
exceptional cases when complex securities are involved³⁰. A legal event arose in
connection with the CVR option. The information below is public record ". 

An intellectual-property case was decided by New York Stock Exchange 
arbitration on 6/11/97 on a claim by T. Inman against Smith Barney regarding the 
Viacom/Paramount bid including the CVR. The total damages asked were over 
$15,000,000. Smith Barney was the advisor to Viacom, and the Smith Barney 
investment bankers designed the bid including the CVR. 

The decision, in favor of Smith Barney, can be accessed on the Internet 
(NYSE, ref. ii). 


29 Viacom's CVR Buyback: Viacom paid $83MM for around 57MM CVRs at a value of
$1.44 each on 7/7/95, corresponding to a median average stock price of $46.56. The
buyback corresponded to the class-2 paths that paid off at t_1 in the analysis. The
initial valuation of the CVR was substantially larger than the actual CVR payout, so
Viacom essentially realized a profit on the CVR.


30 History: I was the quantitative advisor and the designated expert for Smith Barney
during this arbitration. The experience was absorbing. 
14. Two More Case Studies (Tech. Index 5/10)


In this chapter we consider two more case studies of structured exotic products. 
We first consider DECs and synthetic convertibles, including simple and 
complex varieties. We then go on to consider an exotic equity call option with a 
variable strike and expiration. 


Case Study: DECS and Synthetic Convertibles 


Some simple forms of convertibles are synthetic convertibles called DECs. A 
convertible bond is a security that has both bond and embedded equity options with
possibility of conversion to stock’. A synthetic convertible is generally simpler 
than a true convertible bond. A synthetic convertible example is the DECS 
(“Debt Exchangeable for Common Stock” or “Dividend Enhanced Convertible 
Stock”). It is common to use DEC for short, pronounced “deck”; the plural is then
(somewhat inconsistently) DECs. DECs are structures popular in corporate
finance¹. A DEC can be written explicitly as a sum of the underlying stock,
coupon payments, and equity options. A true convertible bond is on the other 
hand a composite structure that requires complex pricing models. 

We start with simple DECs and then proceed to a more complicated DEC
(called DEC_{123}) that will be used to study several ideas in the same example. The
discussion will also give an idea of what is sometimes asked of quants in practice
for coming up with quick "indications" for ideas generated by Corporate Finance.
We will walk through this calculation from soup to nuts. The DEC_{123} has these
attributes:


• It is a "Synthetic Convertible" involving DECs.
• It has a "Best-Of Option"² among three different DECs.


1 An Art of Corporate Finance: There are many similar products: ELKS, ACES,
EAGLES, MEDICS, SUNS, SPIDERS, FLIPS, etc. It seems that making up structured- 
security acronyms as words is a skill honed to a fine edge in Corporate Finance. 


2 "Best-of Option": These clauses in the contract allow the choice of the "best of" a set
of results by the investor X at a given time. In the present situation, X gets to choose the 
best result from three DECS. This provides additional value to X through diversification. 
We shall look at these features in turn³. Those just interested in standard
DECs can read the next section and skip the best-of option section.


Standard DECs (These are simple) 


The idea is that the investor X gets an extra coupon above ABC dividends due
to an exchange of equity options favorable to ABC. The stock on which these
options are written may or may not be ABC stock⁴. At the same time the DECs
allow ABC to issue debt in an advantageous manner. A prototype DEC has the
following composition:


• A coupon is paid to X (calculated as a function of the structure). This is
often re-expressed as a DEC-dividend yield.

• A European call option C_{ABC} is owned by ABC with a strike E_L generally
set at current stock price S_0 at issuance.

• A European call option C_X is owned by X for a fraction η < 1 of a share,
with a strike E_H set at a higher level⁵ E_H = ξ E_L with ξ = 1/η.

• The DEC behaves like the underlying below the current stock price S_0.
Therefore the investor X suffers losses if the stock price drops.


The payout diagram of a DEC at maturity T* from the point of view of the
investor X is provided below:


3 Remark: Although DECs are common, this DEC_{123} invented by Corporate Finance was
not marketed. Here it just serves as a good laboratory example. 


4 Exchangeables: If the stock on which the options are written is not ABC stock, the
option is called an "exchangeable".


5 DEC Jargon: In general it is important to learn the jargon, because that is the
language of the deal description. Moreover, the description may occur at rapid fire pace.
If E_L = 100 and ξ = 1.25, the DEC is said to be "100 up 25". The "conversion premium"
is ξ - 1 or 25%, and the number of shares η received by X upon exercise of the C_X
option, η = 1/ξ = 0.8, is the "participation" or "conversion ratio". The "conversion
price" is E_H.




[Figure: DEC Payout at Maturity, showing the "Down region", the "Flat region",
and the "Up region" with slope η < 1.]


At the start of the deal (issuance) the stock price is called S_0. At T* the stock
price is denoted as S* = S(T*), so if S* < S_0, the investor X has a loss (the
"down region"). For E_H > S* > S_0 the payoff is flat due to X's short C_{ABC}
option position. The payoff increases with slope η above E_H due to X's long
C_X option at participation η.


Various products exist with different numbers of options, resettable strikes, 
callable structures⁶, American options, etc. We will not consider them here.
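
The payout of the standard DEC at maturity described above (down, flat, and up regions) can be sketched as the stock, minus the call given to ABC, plus a fractional call at the higher strike. This is an illustration only: coupons are left out, the lower strike is assumed equal to the issuance price S_0, and the function name `dec_payoff` is hypothetical.

```python
def dec_payoff(s_star: float, s0: float, xi: float) -> float:
    """DEC value to investor X at maturity T*, excluding coupons.

    s_star : stock price at maturity
    s0     : stock price at issuance (assumed equal to the lower strike)
    xi     : ratio of the higher strike to the lower strike, xi > 1
    """
    eta = 1.0 / xi                    # participation (conversion ratio)
    e_high = xi * s0                  # conversion price
    short_call = max(s_star - s0, 0.0)               # call owned by ABC
    long_call = eta * max(s_star - e_high, 0.0)      # call owned by X
    return s_star - short_call + long_call


# "100 up 25": lower strike 100, conversion premium 25%, participation 0.8
payoffs = {s: dec_payoff(s, 100.0, 1.25) for s in (80.0, 110.0, 125.0, 150.0)}
```

For the "100 up 25" structure this gives 80 in the down region, 100 throughout the flat region, and 120 at a stock price of 150, i.e. a slope of 0.8 above the conversion price.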


Analysis of a Standard DEC 


The analysis of the DEC described above is relatively straightforward. A diagram 
of a typical DEC structure is: 


6 Callable DEC: The issuer ABC can call the structure if call provisions are provided.
These call provisions are similar to those of ordinary debt. They include a “call schedule” 
containing each call price or amount ABC will pay X if ABC exercises the call, when the 
call is allowed. There may be NC (No Call) provisions or “protection” during a time 
period when no call is allowed. ABC may call in order to “force conversion”. In this case, 
X will have an American option allowing exercise at any time, and it will be better for X 
to exercise the option to convert to stock rather than accept the call price. 
[Figure: Structure of a DEC, showing the exchange between issuer ABC and
investor X, including the call C_{ABC}.]


The present value (i.e. discounted) for the total sum of coupons paid to X up
to maturity T* (over and above any standard dividends) must, for fairness, be
equal to the difference of the values of two options. These two options are the
call C_{ABC} that X has given to ABC, and the call C_X that ABC has given to X.
Now C_{ABC} > C_X so X does get a positive coupon. The option values are
calculated at the start of the deal when the coupon is specified. The options are
generally simple European options that can be priced with the Black-Scholes
formula. We have


PV(Σ $Coupons) = $C_{ABC} - $C_X    (14.1)


The DEC is quoted using equivalent dividend yield y_{DEC} (constant spread) that
produces the present value of the sum of coupons. Actually, there is a "bogey"
dividend yield y_{DEC}^{Bogey}, attractive enough to investors to make the deal sell.
Denote r_{ABC} as the ABC corporate rate that is used to perform the discounting of
coupons, and call f the number of coupons paid per year up to N years. Then
the yearly $Coupon is the result of performing a standard geometric sum,


$Coupon = [ r_{ABC} (1 + r_{ABC}/f)^{N f} / ( (1 + r_{ABC}/f)^{N f} - 1 ) ] ($C_{ABC} - $C_X)    (14.2)
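
The fairness condition (14.1) and the geometric sum for the coupon can be illustrated numerically. The sketch below makes several assumptions that are not in the text: discrete compounding at r_{ABC}/f per coupon period, coupons of $Coupon/f paid at the end of each period, Black-Scholes option values without dividends, and hypothetical inputs (S_0 = 100, ξ = 1.25, σ = 30%, option rate 5%, r_{ABC} = 7%, quarterly coupons for 3 years).

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s: float, k: float, r: float, sigma: float, t: float) -> float:
    """Black-Scholes European call, no dividends."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def yearly_coupon(pv_coupons: float, r_abc: float, f: int, n_years: int) -> float:
    """Yearly coupon whose discounted sum equals pv_coupons (geometric sum)."""
    growth = (1.0 + r_abc / f) ** (n_years * f)
    return pv_coupons * r_abc * growth / (growth - 1.0)


def pv_of_coupons(coupon: float, r_abc: float, f: int, n_years: int) -> float:
    """Brute-force present value of f coupons per year for n_years."""
    return sum((coupon / f) / (1.0 + r_abc / f) ** k
               for k in range(1, n_years * f + 1))


# Hypothetical deal parameters, for illustration only
s0, xi, sigma, r_opt, n_years = 100.0, 1.25, 0.30, 0.05, 3
r_abc, f = 0.07, 4

c_abc = bs_call(s0, s0, r_opt, sigma, n_years)                    # call given to ABC
c_x = (1.0 / xi) * bs_call(s0, xi * s0, r_opt, sigma, n_years)    # call given to X
pv_target = c_abc - c_x                                           # Eq. (14.1)
coupon = yearly_coupon(pv_target, r_abc, f, n_years)
```

Summing the discounted coupons by brute force with `pv_of_coupons` recovers `pv_target`, which is a useful sanity check on whichever compounding convention is actually specified in a deal.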


Credit Spreads, Discounting, Convertibles, and DECs 


One thorny discussion topic is which interest rate to use for pricing the options in 
a convertible-type security". In particular we want to ask the following question: 
• Which Interest Rate Should Be Used for Equity Options: the ABC Credit
Rate r_{ABC}, or Libor, or Both?


Get ready for a confusing situation. This issue comes up in pricing both 
synthetic and standard convertibles. We start by breaking out the options from 
the DEC coupon—which could be done here. The options could be traded 
separately in a secondary market. 

The following considerations would also be involved in determining the 
credit FVA (Funding Valuation Adjustment), discussed briefly in Ch. 31. 


Textbook Argument: Use Libor⁷


A textbook equity-option model specifies a "risk-free rate", but the question is
more complicated in real markets. Note that increasing the interest rate r
generally increases the price of the equity option. The forward stock price
S_{fwd} at time T* is proportional to exp(rT*) and occurs inside the expectation
integration over S*. The increase of this term with increasing r generally is
larger than the decrease due to discounting with discount factor
DF = exp(-rT*), an overall factor outside the S* integration. A broker-dealer
BD hedges equity options using BD funding (near Libor); this might argue
r = r_{Libor} both for the forward stock price and for discounting.


Other Arguments: Use Corporate Rate with the Credit Spread 

If ABC were to hedge the options, the amount a bank or the markets would
charge for ABC's hedging procedure would, indeed, include the ABC credit
spread. This argues that from ABC's point of view, r = r_ABC. If the investor X
could enter the argument he might say that since he is forced to take ABC credit
risk, he would also use r_ABC everywhere. The options would then be priced
higher, and justifiably from X's point of view would give X an increased DEC
coupon, compensating X for the ABC credit risk. The BD could reach the same
conclusion because these options are embedded in an ABC credit-sensitive
structure and so carry ABC credit risk.


7 Risk-free rate of Libor vs. OIS: This section was written before the 2008 crash. These 
days, discounting would be done using OIS. 
Compromise Point of View: Use Both Rates
An intermediate compromise point of view uses both the r_ABC and r_Libor rates⁸. The
argument here is that S_fwd controls stock (an asset) appreciation and by
no-arbitrage should use a risk-free rate (Libor), so S_fwd ∝ exp(r_Libor T*). However
the discounting of cash flows inside an ABC-issued credit-sensitive security
should be at r_ABC, so the discount factor should be DF = exp(-r_ABC T*).
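The three points of view can be compared side by side with a Black-Scholes-type call in which the forward and the discount factor are allowed to use different rates. This is a minimal sketch, not a production pricer: the function name `two_rate_call`, the flat rates and all numerical inputs are hypothetical.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def two_rate_call(S0, E, T, sigma, r_fwd, r_disc, y=0.0):
    """European call: forward grows at r_fwd - y, payoff is discounted at r_disc."""
    fwd = S0 * math.exp((r_fwd - y) * T)        # S_fwd, proportional to exp(r_fwd * T)
    vol = sigma * math.sqrt(T)
    d1 = (math.log(fwd / E) + 0.5 * vol ** 2) / vol
    d2 = d1 - vol
    return math.exp(-r_disc * T) * (fwd * norm_cdf(d1) - E * norm_cdf(d2))


# Hypothetical inputs (not from the text)
r_libor, r_abc = 0.05, 0.07
args = dict(S0=100.0, E=110.0, T=3.0, sigma=0.30)

all_libor = two_rate_call(r_fwd=r_libor, r_disc=r_libor, **args)  # textbook argument
all_abc = two_rate_call(r_fwd=r_abc, r_disc=r_abc, **args)        # corporate rate everywhere
mixed = two_rate_call(r_fwd=r_libor, r_disc=r_abc, **args)        # compromise: both rates

print(all_libor, all_abc, mixed)
```

With these inputs the all-r_ABC price is the highest and the mixed price the lowest, since the mixed case keeps the Libor forward but discounts more heavily.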

The discounting rates can be further refined depending on the final stock
price S*. If S* < S_0 there is arguably more ABC credit risk at T* than there is
today, so the discounting rate should be above today's ABC rate, i.e. r > r_ABC. If
S* > S_0 there is less credit risk in the future than there is now, so r < r_ABC.

Derman et al. (Ref.) take the view that r = r_ABC at low S* while r = r_Libor at
high S*, since at high enough S*, conversion is certain at T*, and the stock
can then be hedged using a risk-free rate. Ref. iii suggests a blended discounting
rate at intermediate S*.

Finally, there are models that incorporate spreads for S_fwd but discount at
Libor⁹.


What's the Bottom Line? 


Who is "Right"? As elaborated more fully later (Ch. 32, 33), there is no real
Theory of Finance, just phenomenology. Different people support and use
different procedures and different models¹⁰.


8 Two Rates in an Option Formula? It is possible to have two interest rates in an option
formula. For example, treasury-bond short-term options use both repo and Libor rates. 


9 Coupons and Discounting with the ABC Credit Rate: The DEC coupon will be paid
at regular intervals and act like coupons in a standard bond. The discounting of these 
coupons therefore should include the ABC credit spread over Libor, as any bond issued 
by ABC must. It is easy to see this. A par bond issued by ABC must give an extra coupon 
to X to compensate for ABC credit risk, and the bond is at par when the discount rate 
equals the coupon, thus including the credit spread. This extra credit-related amount built 
into the coupon should also be present in a synthetic structure issued by ABC. 


10 Sociology: People can sometimes act a little like religious zealots regarding models. 
Discussions can degenerate to “This model is right" and “That model is wrong". It can 
require diplomacy to avoid getting into turf model battles. 
D_123: The Complex DEC Synthetic Convertible


This section describes a more complicated DEC with a complicated option called
a “best-of option”. A best-of option payoff involves the choice of a maximum of
several payoffs. To get the option value, this payoff is multiplied by the
appropriate probability function (Green function), and then integrated and
discounted. In fact the DEC_123 structure involves three separate DEC_α (α =
1,2,3). The options are written on three separate stocks whose prices are called
S_α. It is possible that none of them will be the ABC stock. The maturity date
T* and the coupon paid are the same for each of the DECs. The best-of option in
this case allows X to choose the best payoff from any of the three
DEC_α. Recall that for the standard DEC, the investor X gets a coupon because
X gives up a net optionality to ABC. Because the best-of feature helps X, the
options become less of a liability to X. Therefore the coupon X gets paid
by ABC will also be decreased¹¹.

Although the coupon has decreased relative to a standard DEC, the best-of
options give X the chance for optimized up-side results. Although the overall
value calculated at issue t_0 will not change due to this feature, X may have a
scenario whereby he believes that the ultimate payoff will be more because of
this best-of feature. If the stocks are chosen from different sectors, for example,
better diversification will be achieved, giving potentially better results.

The best-of option in the structure is a composite object so the individual
options can no longer be broken out. The coupon will be different than in the
simple DEC, as we shall calculate. Here is a diagram of the DEC_123 structure:


[Figure: Best-of-Three-DECs D_123 Structure. Issuer ABC gives the best-of option
on the calls C_{X,α} to Investor X; X gives the calls C_{ABC,α} to ABC.]


11 Alternative: Alternatively to lowering the coupon, ABC could increase the conversion
price, i.e. move to higher value the strike of the C_{X,α} options. This would make the
C_{X,α} options individually less valuable but get to the same total option value through
the best-of feature.
Pricing D_123: A Preview of Green-Function Techniques


In the rest of this chapter, we need to increase the mathematical firepower a bit.
We will use some results from path integrals and Green-function techniques for
pricing options. The interested reader can get a full exposition in Ch. 41-45.
Assuming lognormal stock dynamics the result for the Green function of n
stocks is naturally an n-variate normal distribution. We present it in general
form for use later. At the initial time t_a set x_{a,α} = ln[S_α(t_a)], set
x_{b,α} = ln[S_α(t_b)] at the final time t_b, and set T_ab = t_b - t_a. The volatility of
the time difference d_t x_α(t) = x_α(t+dt) - x_α(t) is σ_{ab,α} over the time window
T_ab. The correlation¹² between d_t x_α and d_t x_β is ρ_{ab,αβ} over the time window
T_ab. The correlation matrix is (ρ_ab) with matrix inverse (ρ_ab^{-1}). Again a, b
label the times and α, β label the matrix indices. The Green function is¹³:

\[ G_{ab}\big(\{x_a\},\{x_b\};t_a,t_b\big) = \frac{\Theta(T_{ab})\,\exp(-r_{ab}T_{ab})}{\sqrt{\det(\rho_{ab})}\;\prod_{\alpha=1}^{n}\sqrt{2\pi\sigma_{ab,\alpha}^{2}T_{ab}}}\;\exp\big[-\Phi_{ab}\big] \tag{14.3} \]

Here, 


12 Volatility and Correlation Input Choices: The choice of correlations and volatilities
is critical. If historical volatilities are used they are often calculated over historical time
windows with lengths equal to the forward time window of the deal T_ab = T* - t_0 on the
grounds that history over time periods of similar length to the future time period may be
the most relevant. Implied stock volatilities can also be used, but it can easily happen that
the time window for the deal may be longer than the maturities of existing available
options. Finally, as emphasized later in the book, correlations are unstable over windows
of interest and probably stochastic. Therefore stressed values of the correlations,
consistent with a positive definite correlation matrix, need to be used in order to
determine the correlation risk.


13 Notation: dx, d_t x and D_t x. THESE ARE ALL DIFFERENT: We need to be careful to
distinguish various differentials. First, dx_α(t) is the differential of values of x_α(t) at one
single time t. This appears in the measure of the Green function written as a function of
the {x_α}. Next d_t x_α(t) = x_α(t+dt) - x_α(t) is the time difference of x_α at neighboring times
t+dt and t. This is used to get volatilities and correlations. In practice dt will be a time
small compared to T_ab; popular choices include dt from 1 day to 3 months. Finally D_t x is
the finite-time difference generalization of d_t x over time T_ab. It arises when successive
Gaussian propagations are convoluted (using the semi-group property) to get a Gaussian
over finite T_ab. Which differential we are using will in any case be clear from the context.
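\[ \Phi_{ab} = \frac{1}{2}\sum_{\alpha,\beta=1}^{n} \frac{D_t x_{\alpha}\,\big(\rho_{ab}^{-1}\big)_{\alpha\beta}\,D_t x_{\beta}}{\sigma_{ab,\alpha}\,\sigma_{ab,\beta}\,T_{ab}} \tag{14.4} \]

Also
\[ D_t x_{\alpha} = x_{b,\alpha} - x_{a,\alpha} - \mu_{ab,\alpha}T_{ab} \tag{14.5} \]

As usual the drift μ_{ab,α} is given in terms of the interest rate r_ab over T_ab, the
stock dividend yield y_{ab,α}, and the Ito correction term due to the lognormal
dynamics assumed as
\[ \mu_{ab,\alpha} = r_{ab} - y_{ab,\alpha} - \sigma_{ab,\alpha}^{2}/2 \tag{14.6} \]
The measure for given {x_{a,α}}, integrating over {x_{b,α}}, is the simple product
of ordinary integrals, namely \prod_{\alpha=1}^{n}\int_{-\infty}^{\infty} dx_{b,\alpha}.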
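The n-stock Green function of (14.3)-(14.6) is a discounted multivariate normal density in the log-prices, and it can be written in a few lines. The sketch below is illustrative only: the function name, the argument layout, the use of a single r_ab for all stocks and the sign convention (a non-negative Φ_ab entering as exp[-Φ_ab]) are assumptions made here.

```python
import numpy as np


def green_function(x_a, x_b, sigma, rho, r, y, T):
    """Discounted n-variate lognormal Green function over a time window T = t_b - t_a.

    x_a, x_b : log stock prices at the initial and final times (length n)
    sigma    : volatilities (length n); rho : n x n correlation matrix
    r        : interest rate; y : dividend yields (length n)
    """
    x_a, x_b, sigma, y = (np.asarray(v, dtype=float) for v in (x_a, x_b, sigma, y))
    rho = np.asarray(rho, dtype=float)
    if T <= 0:
        return 0.0                                   # Heaviside step function Theta(T_ab)
    mu = r - y - 0.5 * sigma ** 2                    # drift with Ito correction, cf. (14.6)
    D = x_b - x_a - mu * T                           # finite-time difference, cf. (14.5)
    u = D / (sigma * np.sqrt(T))
    phi = 0.5 * u @ np.linalg.solve(rho, u)          # quadratic form, cf. (14.4)
    norm = np.sqrt(np.linalg.det(rho)) * np.prod(np.sqrt(2.0 * np.pi * sigma ** 2 * T))
    return float(np.exp(-r * T) * np.exp(-phi) / norm)


# Hypothetical check: with zero correlation the two-stock density factorizes
g1 = green_function([0.0], [0.1], [0.3], [[1.0]], 0.05, [0.01], 2.0)
g2 = green_function([0.0], [-0.2], [0.2], [[1.0]], 0.05, [0.02], 2.0)
g12 = green_function([0.0, 0.0], [0.1, -0.2], [0.3, 0.2], np.eye(2), 0.05, [0.01, 0.02], 2.0)
print(g12, g1 * g2 * np.exp(0.05 * 2.0))
```

The factorization check divides out one discount factor because each single-stock Green function carries its own exp(-rT).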


The Payoffs and Calculation of the Best-of Option

The calculation of the Best-of-Two DEC_αβ value is the integral over (S_α*, S_β*)
of the (α, β) bivariate (n = 2) Green function with integrand multiplied by the
T* payoff, C_{DEC_αβ}(T*) = max(DEC_α, DEC_β)(T*). There are three of these
DEC_αβ with (α, β) = (1,2), (1,3) and (2,3). The calculation of the Best-of-Three
DEC_123 option value is the integral over (S_1*, S_2*, S_3*) of the trivariate (n = 3)
Green function integrand that is multiplied by the DEC_123 payoff at T*,
C_{DEC_123}(T*) = max(DEC_1, DEC_2, DEC_3)(T*). All results are then discounted.

The payoff for the three-variable case is somewhat messy to administer
since there are 5 regions due to the three separate regions for each DEC_α,
namely:

1. All S_α* in the “down” region. In this region, X has lost money due to
downside risk, but at least he has minimized the loss.

2. At least one S_α* in the “flat” region but no S_α* in the “up” region. Here X
gets back the notional even if some stocks have dropped.

3. One S_α* in the “up” region. One of the C_{X,α} options is in the money.

4. Two S_α*, S_β* in the “up” region. Two C_{X,α} options are in the money.

5. All S_1*, S_2*, S_3* in the “up” region. All C_{X,α} options are in the money.


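The corresponding maximum payouts to X in the five regions are then¹⁴:

\[
\begin{aligned}
&1.\quad \$N\,\max_{\alpha}\big[S_{\alpha}^{*}/S_{0\alpha}\big] \\
&2.\quad \$N \\
&3.\quad \$N\,\big\{1 + (S_{\alpha}^{*} - E_{X,\alpha})/S_{0\alpha}\big\} \\
&4.\quad \$N\,\big\{1 + \max_{\alpha,\beta}\big[(S_{\alpha}^{*} - E_{X,\alpha})/S_{0\alpha},\,(S_{\beta}^{*} - E_{X,\beta})/S_{0\beta}\big]\big\} \\
&5.\quad \$N\,\big\{1 + \max_{\alpha=1,2,3}\big[(S_{\alpha}^{*} - E_{X,\alpha})/S_{0\alpha}\big]\big\}
\end{aligned}
\tag{14.7}
\]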


Using $,, = 9E, ,. we can simplify the payoff in ће “up” regions using 


EE (SEn) (14.8) 


0a 


Indicative Pricing 


An indicative price is a non-binding price transmitted by a marketer to a client or 
other counterparty that often has a critical time element, leaving little time for 
detailed quantitative modeling analysis. Indicative results therefore sometimes 
need to be obtained using simplified input parameters and/or an approximate 
numerical method. Perhaps a similar analysis has essentially already been done 
and can be quickly modified from this “off-the-shelf” model, but this is by no 
means always the case. 

Regardless of the indication, if a product is utilized in a deal with a later 
secondary market for trading, a thorough numerical analysis has to be 
programmed along with hedging, and put into a risk-management system. 

The origin of this chapter was in fact an indicative price for a deal that never 
got done, but the price had to be obtained quickly. The numerical approximation 


14 Normalization: $N is the overall normalization or notional. 
used was a coarse bucketing approximation for the payoffs, although calculations
of the bivariate and trivariate normal probabilities were performed accurately.

Denote y_DEC as the average of the DEC_α dividend yields, y_2-DEC the
average two-stock DEC_αβ dividend yield and y_3-DEC the DEC_123 dividend yield.
The results satisfied the inequalities expected physically:

\[ y_{DEC} > y_{2\text{-}DEC} > y_{3\text{-}DEC} \tag{14.9} \]

Again, these DEC dividend yields are in addition to any standard stock
dividends and are the equivalent yields to the coupons paid to X due to the
option structures of the product in the various cases. The best-of options did have
a significant effect. This effect increases with the number of choices, as was
expected intuitively by Corporate Finance¹⁵.
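The ordering in (14.9) reflects the fact that a best-of payoff is worth more to X as choices are added, so less coupon is needed. The Monte Carlo sketch below illustrates only that ordering. It assumes a stylized three-region payoff for each DEC (down: N·S*/S_0, flat: N, up: N[1 + (S* - E_X)/S_0]), identical hypothetical parameters for the three stocks and a flat correlation of 0.4; none of these inputs come from the deal described in the text.

```python
import numpy as np


def dec_payoff(s_final, s0, strike, notional=1.0):
    """Stylized single-DEC payoff at T* (assumed form: down / flat / up regions)."""
    down = notional * s_final / s0
    up = notional * (1.0 + (s_final - strike) / s0)
    return np.where(s_final < s0, down, np.where(s_final > strike, up, notional))


def best_of_values(n_paths=100_000, seed=0):
    """Discounted values of a single DEC, best-of-two and best-of-three payoffs."""
    rng = np.random.default_rng(seed)
    s0 = np.array([100.0, 100.0, 100.0])
    strike = 1.2 * s0                                   # hypothetical conversion prices
    sigma = np.array([0.30, 0.30, 0.30])
    r, y, T = 0.05, 0.01, 3.0
    rho = np.array([[1.0, 0.4, 0.4], [0.4, 1.0, 0.4], [0.4, 0.4, 1.0]])
    # Correlated standard normals via the Cholesky factor of the correlation matrix
    z = rng.standard_normal((n_paths, 3)) @ np.linalg.cholesky(rho).T
    s_final = s0 * np.exp((r - y - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
    pay = np.exp(-r * T) * dec_payoff(s_final, s0, strike)
    one = pay[:, 0].mean()
    two = pay[:, :2].max(axis=1).mean()
    three = pay.max(axis=1).mean()
    return one, two, three


v1, v2, v3 = best_of_values()
print(v1, v2, v3)
```

Because the same paths are used for all three cases, the values are ordered path by path, and the gap between them is what would be traded against a lower coupon.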


Case Study: Equity Call with Variable Strike and Expiration 


The logic is pictured in the illustration below: 


[Figure: Option Logic Dependence on Paths. Paths starting at S_0 are classified at
t_decision; the labeled categories include “Unchanged” and “Extended”.]

15 Corporate Finance and Intuition: Quants should be aware that before being
approached, a proposed transaction has been discussed enough in Corporate Finance so 
that these people have good intuition of what is going on. You should also try hard to get 
an intuitive feel. If your result does not agree with intuition, you may have misunderstood 
something, always possible in quick conversations. The ideal situation is to get a term 
sheet describing the proposed deal unambiguously, even for an indication, in order to 
help avoid such misunderstandings. 
As pictured above, this option has some complicated logic. It is a European
call option with attached side conditions. The basic option has strike E and
expiration date t*. At a given decision date t_decision the stock price S(t_decision) is
compared with three given levels S_1, S_2, and E/λ with S_1 < S_2 < E/λ. Here
λ > 1 is a parameter. The current price S_0 at time t_0 is in the band (S_2, E/λ):

• If S(t_decision) < S_1, the option expiration is extended to t* + 1

• If S_2 < S(t_decision) < E/λ, the strike is lowered to E_λ = λ S(t_decision)

• Else the option remains unchanged


The idea is to provide more value to the investor X. If the stock price drops
substantially as specified by the first condition, additional time value is added
with the possibility that the stock price will go back up. If the stock price is in the
band specified by the second condition, the strike is lowered giving additional
value to the call.

Note that there is no constraint on the paths before or after t_decision. Therefore,
although the picture is not drawn this way, paths in any of the categories can
cross back and forth over the horizontal boundaries. The classification of the
path-dependence aspects of the option occurs only at t_decision.

Additional complexities also exist. The decision is made not on the value of
the stock exactly at t_decision but on the price averaged for a certain number of days
Δt_avg before and up to t_decision. Such an averaging specification is quite common.
Theoretical Valuation of the Option
The European option can be valued formally using the path integral techniques
presented later in the book¹⁶. The formalism pictorially follows the above
diagram. Define the Green function¹⁷


16 Path Integral Remarks: You do not have to understand path integrals to follow this
section. Just look at the diagrams that correspond directly to the formulae. For simplicity,
we neglect the averaging before the decision time, although this can also be included.
This section is based on standard lognormal dynamics. If you already know path integrals
this is just free field theory with some boundary conditions. See Ch. 41-45 for details.


17 Semantics: As was once explained to me, it makes sense to say Green function, not
Green's function. You don't talk about a Bessel's function, right?
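\[ G_{ab} = G(x_a, x_b; t_a, t_b) = \frac{\Theta(T_{ab})\,\exp(-r_{ab}T_{ab})}{\sqrt{2\pi\sigma_{ab}^{2}T_{ab}}}\;\exp\big[-\Phi_{ab}\big] \tag{14.10} \]


Here¹⁸
\[ \Phi_{ab} = \frac{\big[x_b - x_a - \mu_{ab}T_{ab}\big]^{2}}{2\sigma_{ab}^{2}T_{ab}} \tag{14.11} \]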


The logic can be thought of as two exotic options given to X that expire at
t_decision and result in two knock-in options¹⁹ under suitable conditions, namely:

• An extension option C_ExtendTime that knocks in if S(t_decision) < S_1

• A strike-lowering option C_LowerStrike that knocks in if S_2 < S(t_decision) < E/λ


The diagram below introduces some notation for the calculation:

[Figure: Notation for Green Functions and Integrals]


18 Variables: Here t_a is an arbitrary initial time, x_a = ln S(t_a), t_b is an arbitrary final
time, x_b = ln S(t_b), σ_ab is the volatility, r_ab the “risk-less” interest rate,
μ_ab = r_ab - y_ab - σ_ab²/2, y_ab the dividend yield, T_ab = t_b - t_a and Θ the Heaviside step
function forcing T_ab > 0. Further explanation is given later in the book.


19 Knock-in Options: These are barrier options that spring into existence if the knock-in
condition occurs. The formalism of barrier options is described in detail in Ch. 17-19.




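In the figure the arrows point backwards because we propagate cash-flow
information backwards from expiration. The option can be broken into 3 parts.

\[ C(S_0, t_0) = C_{Unchanged} + C_{Extend\,Time} + C_{Lower\,Strike} \tag{14.12} \]

Here the three component integrals are²⁰:

\[
\begin{aligned}
C_{Unchanged} &= \int\limits_{S_{dec}\in(S_1,S_2)\ \text{or}\ S_{dec}>E/\lambda} dx_{dec} \int_{\ln E}^{\infty} dx_1^{*}\; G_{0,dec}\,G_{dec,Exp1}\,\big(S_1^{*} - E\big) \\
C_{Extend\,Time} &= \int\limits_{S_{dec}<S_1} dx_{dec} \int_{\ln E}^{\infty} dx_2^{*}\; G_{0,dec}\,G_{dec,Exp2}\,\big(S_2^{*} - E\big) \\
C_{Lower\,Strike} &= \int\limits_{S_2<S_{dec}<E/\lambda} dx_{dec} \int_{\ln(\lambda S_{dec})}^{\infty} dx_1^{*}\; G_{0,dec}\,G_{dec,Exp1}\,\big(S_1^{*} - \lambda S_{dec}\big)
\end{aligned}
\tag{14.13}
\]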


These integrals can be performed as bivariate normal integrals. Some
approximations have to be implemented to include the averaging before t_decision.
We shall not pursue it further, but go next to a Monte Carlo valuation.


Monte-Carlo Valuation


In order to price a complicated option like this, in general a Monte Carlo (MC)
simulator can be used²¹. Then the logic tree can be implemented in a
straightforward way²². Given some parameters²³, the option value can be


20 Notation: To save space, we write “S_dec” instead of S(t_decision) and use “Exp1” to denote
the normal expiration t* where the stock price is called S_1*. Similarly “Exp2” is the
extended expiration t*+1 where the stock price is called S_2*. All x variables are ln S with
the appropriate labels.


21 Forward and Backward Propagation: The Monte Carlo simulator propagates paths
going forward starting at S_0. In practice this is more convenient than propagating paths
backwards because only a small fraction of paths propagated backward will arrive at S_0
within any dS_0. The difference between forward and backward propagation only matters
when there is a stock-dependent volatility σ[S(t),t] such as occurs in “volatility skew”.
For a constant volatility such as assumed here there is no difference.


22 Implementation with Monte Carlo Simulation: The numbers in this section were
obtained with the software Derivatool™ from Financial Engineering Associates, Inc. My
understanding, based on a conversation with Mark Garman, is that Derivatool works in a
manner similar to the theoretical presentation above, jumping over all times where paths
found. In this case we get $C = $3.71 with 1 sd statistical MC error ± $0.08. It is
interesting to ask how much of the value was due to all this complexity. A plain
vanilla 3-yr call with the same parameters is $C_PlainVanilla = $3.49. The difference
is therefore²⁴ $δC = $C - $C_PlainVanilla = $0.22 ± $0.08.
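The logic tree is short enough to sketch directly. The code below is a minimal illustration and not the Derivatool calculation: it uses the quoted input parameters, ignores the 20-business-day averaging, jumps in one lognormal step from t_0 to t_decision and in a second step to the relevant expiration, and uses a hypothetical path count and seed. The Black-Scholes function reproduces the plain vanilla comparison value.

```python
import math
import numpy as np

# Quoted input parameters for the case study
S0, S1, S2, LAM, E = 19.875, 16.50, 18.00, 1.2, 25.0
T_EXP, T_DEC, EXTENSION = 3.0, 0.79, 1.0
SIGMA, R, Y = 0.31, 0.0625, 0.012


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, T, sigma, r, y):
    """Black-Scholes European call with continuous dividend yield y."""
    d1 = (math.log(S / K) + (r - y + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * math.exp(-y * T) * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def mc_price(n_paths=200_000, seed=0):
    """MC value and 1 sd error of the call with variable strike and expiration."""
    rng = np.random.default_rng(seed)
    mu = R - Y - 0.5 * SIGMA ** 2
    z1, z2 = rng.standard_normal(n_paths), rng.standard_normal(n_paths)
    s_dec = S0 * np.exp(mu * T_DEC + SIGMA * math.sqrt(T_DEC) * z1)
    extended = s_dec < S1                              # expiration moves to t* + 1
    lowered = (s_dec > S2) & (s_dec < E / LAM)         # strike moves to lambda * S_dec
    tau = np.where(extended, T_EXP + EXTENSION, T_EXP) - T_DEC
    strike = np.where(lowered, LAM * s_dec, E)
    s_exp = s_dec * np.exp(mu * tau + SIGMA * np.sqrt(tau) * z2)
    payoff = np.exp(-R * (T_DEC + tau)) * np.maximum(s_exp - strike, 0.0)
    return payoff.mean(), payoff.std(ddof=1) / math.sqrt(n_paths)


vanilla = bs_call(S0, E, T_EXP, SIGMA, R, Y)
price, err = mc_price()
print(f"vanilla={vanilla:.2f}  exotic={price:.2f} +/- {err:.2f}")
```

With more paths than the 10K trials mentioned in the footnote, the statistical uncertainty on the difference from the vanilla value shrinks accordingly. The uncertainties from the inputs and from the lognormal assumption do not.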


“Back of the Envelope” Calculation


Now we will do a back-of-the-envelope calculation for $δC. This means we will
make simplifications, and see if we can get the right “ball-park” answer²⁵, say
within a factor 2 for $δC. There are several good reasons for being able to do
this.


• Estimates performed after the fact of doing the MC calculation help us
understand the answer²⁶.


are unrestricted with no logic constraints with a single Green function. This is applicable
from t_0 up to t_decision - Δt_avg and then from t_decision to expiration (either t*_Exp1 = t* or
t*_Exp2 = t*+1). The MC results are of course averages with statistical errors for a certain
number of trials (here 10K). In practice the daily 20 bd averaging was simplified.


23 Input Parameters: Starting stock price $S_0 = $19.875, lower boundary $S_1 = $16.50,
middle boundary $S_2 = $18.00, parameter λ = 1.2, strike $E = $25, time from start to
expiration t* - t_0 = 3 yrs, time to decision t_decision - t_0 = 0.79 yrs, lognormal volatility
(chosen constant) σ = 0.31/yr^{1/2}, interest rate r = 0.0625/yr (continuous, 365), dividend
yield y = 0.012/yr (continuous, 365), averaging period Δt_avg = 20 bd (business days).
These parameters can move around when working on a transaction for a variety of
reasons (changes in the deal definition, changes in the market, refinements in the analysis
etc). One refinement of the analysis would be to take a term structure of volatility σ(t) or
even include some volatility skew (although this would be unlikely for a transaction
involving a single stock).


24 “Errors and Uncertainties”: Although the statistical MC error in $C was $0.08 (only
2% of $C), the resulting statistical error in $δC is large (36% of $δC). Some non-
technical people will believe that if you report an “error” you have made a mistake. It 
may be better semantically to discuss “uncertainties” rather than “errors”. On the other 
hand, there are also uncertainties in the input parameters to the calculation, 
simplifications in the lognormal model assumption (because real stock movements differ 
from lognormal), etc. Therefore it is in fact a mistake (a real mistake) to report a small 
MC error and imply that this is the only uncertainty in the calculation. 


25 Back of the Envelope and Management: Scientists and engineers will know what is
meant by “back of the envelope". However a Wall Street manager may make up some 
accuracy target that is acceptable to him and then assume (or request) that your quick 
estimate will satisfy that accuracy. For this reason you need to be extremely careful about 
how you discuss approximate calculations, and make it clear that you only believe the 
results are accurate to whatever level you think they are accurate. 
• If you are on the desk and the salesman wants to have some sort of
“indication” (as he will call it) by 2pm, you need to be able to do this sort of
calculation without waiting for 2 months while it is being programmed and
getting Put In The System.


What is really going on? First, with definite probabilities there are transitions
from S_0 to various regions of the S_dec axis where the logic occurs (again “dec” is
short for decision). We can get these probabilities straightforwardly from the
normal distribution. Then we can figure out what happens after t_dec.


Extension Option Contribution to $δC

Consider first the extension option. If S_dec < S_1, the option is extended. So the
contribution of the extension is really the difference of a call from t_dec to t* + 1
and a call from t_dec to t*. In order to price this difference we need to assume a
reasonable starting point for S_dec at t_dec. We collapse all S_dec < S_1 paths²⁷ to
S_dec = S_1. We know this is an overestimate since paths starting at lower S_dec
contribute less than paths starting at S_1. However the strike $E = $25 is above
$S_1 = $16.50, so this part of the option is out of the money, and the contribution
will not be very large anyway. The contribution to $δC from the extension
option can then be approximated as the probability P[S_dec < S_1] that S_dec < S_1
at t_dec for paths starting at S_0 times the difference of two ordinary call options:

26 Why You Should “Understand” the Answer the Model Produces: You should feel
comfortable intuitively with the output of the model. Always perform sanity checks. Here
are four good reasons:

First reason: Because your management or other interested parties may well want to
know why the answer that you got in the model is so small or why it is so big. Here is the
way this will happen. Person In Authority: “I don’t believe that result”. You: “...”.

Second reason: The next time this sort of problem comes up, you will be better able 
to deal with it. 

Third reason: A prototype rough calculation is always useful for sanity checks before 
the Real Production Model is developed. 

Fourth reason: Maybe the model calculation actually has a real mistake (math, 
parameter input, programming, reporting, misapplication, ...) that you will discover by 
performing rough calculations. It is bad practice to believe an answer just because it is on 
a printout from a computer. 


27 Approximation: It is a good idea to keep track of the assumptions that you are making
so you can refine them later if you have time and if it fits in the priority list. 
\[ \delta C_{Extend\,Time} \approx P\big[S_{dec} < S_1\big]\cdot\Big\{C\big[S_{dec}=S_1;\,t_{dec},\,t^{*}+1\big] - C\big[S_{dec}=S_1;\,t_{dec},\,t^{*}\big]\Big\} \tag{14.14} \]


Numerically this approximation turns out to give $δC_ExtendTime ≈ $0.19.


Lowered Strike Option Contribution to $δC and a Consistency Check


The strike gets lowered if S_dec is between S_2 and E/λ; numerically this means
in the region $18 to $20.83. This is a band narrow enough so we can try using the
average of these two numbers, $S_dec^avg = $19.42, to start this class of paths from
t_dec. The lowered strike in this approximation is $E_λ = λ$S_dec^avg = $23.30. Hence
the lowered strike option contribution to $δC in this approximation is

\[ \delta C_{Lower\,Strike} \approx P\big[S_2 < S_{dec} < E/\lambda\big]\cdot\Big\{C\big[S_{dec}^{avg}, E_{\lambda};\,t_{dec},\,t^{*}\big] - C\big[S_{dec}^{avg}, E;\,t_{dec},\,t^{*}\big]\Big\} \]

Here, P[S_2 < S_dec < E/λ] is the probability that S_2 < S_dec < E/λ at t_dec for
paths that start at S_0. This approximation produces $δC_LowerStrike ≈ $0.10.


Hence the total for our BOE (back of the envelope) calculation is
$δC_BOE Calc = $δC_ExtendTime + $δC_LowerStrike ≈ $0.29


Now the result from the MC calculation was $δC = $0.22 ± $0.08. We
see that we are indeed in the same ballpark, and are even (mirabile dictu)
within the MC errors. Because the biggest contribution was the extension option
for which we know we made an overestimate, and because the calculation is
bigger than the MC result, there seems to be a consistency²⁸.
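The back-of-the-envelope numbers can be reproduced in a few lines. The sketch below makes some choices that are assumptions here rather than statements from the text: the probabilities are taken from the lognormal distribution with drift r - y - σ²/2 over t_dec, the calls are Black-Scholes values as of t_dec with no further discounting back to t_0, and the helper names are hypothetical.

```python
import math

# Quoted input parameters
S0, S1, S2, LAM, E = 19.875, 16.50, 18.00, 1.2, 25.0
T_EXP, T_DEC = 3.0, 0.79
SIGMA, R, Y = 0.31, 0.0625, 0.012


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, T):
    """Black-Scholes call with the quoted volatility, rate and dividend yield."""
    d1 = (math.log(S / K) + (R - Y + 0.5 * SIGMA ** 2) * T) / (SIGMA * math.sqrt(T))
    d2 = d1 - SIGMA * math.sqrt(T)
    return S * math.exp(-Y * T) * norm_cdf(d1) - K * math.exp(-R * T) * norm_cdf(d2)


def prob_below(level):
    """P[S_dec < level] for lognormal paths starting at S0."""
    mu = R - Y - 0.5 * SIGMA ** 2
    return norm_cdf((math.log(level / S0) - mu * T_DEC) / (SIGMA * math.sqrt(T_DEC)))


tau = T_EXP - T_DEC                                   # time from t_dec to t*

# Extension option: collapse all S_dec < S1 paths to S_dec = S1
dC_ext = prob_below(S1) * (bs_call(S1, E, tau + 1.0) - bs_call(S1, E, tau))

# Lowered strike: start the band paths from the band midpoint
S_avg = 0.5 * (S2 + E / LAM)
p_band = prob_below(E / LAM) - prob_below(S2)
dC_low = p_band * (bs_call(S_avg, LAM * S_avg, tau) - bs_call(S_avg, E, tau))

dC_boe = dC_ext + dC_low
print(f"extension={dC_ext:.2f}  lower strike={dC_low:.2f}  total={dC_boe:.2f}")
```

Under these assumptions the two pieces come out close to the quoted $0.19 and $0.10. Discounting the pieces from t_dec back to t_0 would lower each by about 5%, which is well inside the factor-of-2 target.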


28 Why Did This Approximation Work? I think it is mostly because $δC was small. In
other cases, we may not be so lucky. Other cruder approximations did not work as well.
All I would really want to claim is that the approximation to $δC is perhaps within a
factor of 2.
15. More Exotics and Risk (Tech. Index 5/10)


This chapter examines several exotic options and some related risk measures. The 
topics are: 


Contingent Caps (Complex and Simple) 

Digital Options: Pricing and Hedging 

Historical Simulations and Hedging 

Yield-Curve Shape and Principal-Component Options 
Principal-Component Risk Measures 

Hybrid Two-Dimensional Barrier Options 

Reload Options 


Contingent Caps 


Clients and Contingent Caps 


In this section, we describe an exotic product called the contingent cap¹,². As a
client, a corporation ABC wants to buy a cap from a broker dealer BD. As for
any cap, ABC's motivation is to get insurance to protect against rising rates. The
cost of a standard cap $C_cap is paid up-front in cash. This outlay can be
substantial and is lost if ABC does not use the insurance (i.e. the cap does not pay
off). The Contingent Cap is a product that keeps the same cap/insurance payoff,
but makes the payment for the cap by ABC contingent on certain conditions.
These conditions can be dependent on the underlying rate itself in such a way that
ABC essentially only “pays for the insurance if the insurance pays out”, or if the
insurance has a good chance of paying out. In fact, ABC may wind up paying
nothing at all. Of course the amount $C_ContCap that ABC pays if a payment is
required will be more than $C_cap.


1 History: This was an instructive foray in new product development. The interest-rate
contingent cap in the text was invented and developed by me in 1991. After an 
unpropitious start, the product actually sold reasonably well after 1993. 


2 Acknowledgement: I thank Bruce Fox for his enthusiasm for marketing contingent
caps, and for helpful discussions on many topics. 
Essentially with a contingent cap, ABC buys a cap from BD and sells a
contingent payment option to BD of equal value, so that no up-front cash is
exchanged. The contingent option, if exercised, produces payment $C_ContCap from
ABC to BD.
The picture for the structure (cap, contingent option) of the contingent cap is:

[Figure: Structure of the Contingent Cap. BD sells the cap and buys the contingent
option; ABC buys the cap and sells the contingent option.]


The contingent cap is complicated. For a cap, there are a number of resets. If
Libor on all resets is below a trigger rate K, then ABC pays nothing. Otherwise,
the first time that L > K on a reset date, ABC pays $C_ContCap and nothing after
that³. We write, with χ > 1, the payment equation $C_ContCap = χ·$C_cap. We
generally have χ < 2 with this product because the composite probability that
ABC has to pay something is above ½. This follows since many reset trials (all
requiring “success” from ABC's viewpoint) are needed if ABC “escapes” and
winds up paying nothing⁴.
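A toy calculation shows why the premium factor stays below 2 once there are several resets. The sketch below is deliberately crude: it assumes each reset independently has the same probability p of Libor being above the trigger, and it ignores discounting and the BD spread. Real resets are serially correlated, so the numbers are purely illustrative, and the function names are hypothetical. The birthday function reproduces the analogy mentioned in the footnote.

```python
def payment_probability(p_reset: float, n_resets: int) -> float:
    """Probability that at least one of n independent resets is above the trigger."""
    return 1.0 - (1.0 - p_reset) ** n_resets


def premium_factor(p_reset: float, n_resets: int) -> float:
    """Toy premium factor: contingent payment / up-front cap price (no discounting)."""
    return 1.0 / payment_probability(p_reset, n_resets)


def birthday_probability(n_people: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday."""
    p_distinct = 1.0
    for k in range(n_people):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct


# One 50-50 decision doubles the price; eight resets at p = 0.2 give a factor near 1.2
print(premium_factor(0.5, 1), round(premium_factor(0.2, 8), 3))
print(round(birthday_probability(22), 4), round(birthday_probability(23), 4))
```

The escape probability (1 - p)^n falls quickly with the number of resets, which is the same mechanism that makes a shared birthday likely in a small group.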

The picture below gives the idea:


3 Connection of Contingent Caps With Barrier Options: The contingent cap is really a
knock-in discrete barrier option, with the discrete barrier being given by the trigger rate at 
the discrete set of reset times of the cap. See Ch. 17-19. 


4 The Birthday Problem and Contingent Caps: The birthday problem is: “How many
people are needed in order that the probability is around 50% that two of the people have
their birthday on the same day”. The answer is only 22 or 23 people. The same situation
exists for the contingent cap. Only a few resets near or at the money are needed in order 
that payment for the contingent cap has a reasonably high probability. 
[Figure: Contingent Cap Logic. Libor at successive reset times relative to the trigger
rate K, with regions marked “Pay only here” and “No payment here”. The contingent
cap requires payment the first and only the first time that Libor is above the trigger
rate K at a reset.]


The value of the trigger rate K can for simplicity be chosen at the break-even
rate R_BE for the swap whose resets are at the same times as those of the
cap⁵. The strike E of the cap can be equal to K or different. The contingent
payment naturally builds in a spread for the BD. There is downside to BD
because dealing with an exotic requires extra hedging, operational, and systems
costs.


A Simple “Contingent Cap” with Independent Caplets (Not This Product)
The contingent option first originated as a simple single-decision FX option.
There, payment is due if the exchange rate η is above a certain level η_0, and
not due if η < η_0, with η_0 ≈ η_spot. Roughly (ignoring interest-rate effects),
the cost of such an option is double the price of a standard option. This is because


5 Limitations on the Trigger: If the trigger is far above the swap rate, the potential 
payments get large because the probability of escape is high. This limits the attractiveness 
of the single-level contingent cap. However, the 50-50 contingent cap with two triggers 
can have one trigger at a high level. 
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the probabilities that т goes up or down are roughly ^. Hence 50% of the time 
ABC doesn't pay anything, so when ABC does pay, ABC has to pay double. 

A direct extension of this single-decision option is just a basket of 
independent single-decision options. A simpler product that is just a set of 
independent contingent caplets (see Schap, Ref.) has also been called a 
contingent cap. Multiple payments for caplets in this product are due if Libor 
crosses the trigger at various resets⁶. 

This basket of independent contingent caplets is not the contingent cap 
described here. The contingent cap in this chapter is a composite structure that 
cannot be broken up into a simple sum, and requires the payment at most once 
(the first time that Libor crosses the trigger). 
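
The difference between the two products can be sketched on a single rate path. The code below is only an illustration of the payment logic described above (one payment at the first crossing versus one payment per crossing); the function names and the Libor fixings are hypothetical, and no pricing is attempted.

```python
from typing import List, Optional, Sequence


def first_trigger_reset(libor_resets: Sequence[float], trigger: float) -> Optional[int]:
    """Contingent cap of this chapter: index of the first reset above the trigger."""
    for k, rate in enumerate(libor_resets):
        if rate > trigger:
            return k
    return None  # Libor never above the trigger: no payment at all


def independent_caplet_payments(libor_resets: Sequence[float], trigger: float) -> List[int]:
    """Basket of independent contingent caplets: a payment at every reset above the trigger."""
    return [k for k, rate in enumerate(libor_resets) if rate > trigger]


resets = [0.050, 0.058, 0.064, 0.059, 0.066]  # hypothetical Libor fixings at the resets
K = 0.060                                     # hypothetical trigger rate

print(first_trigger_reset(resets, K))          # a single payment, at this reset index
print(independent_caplet_payments(resets, K))  # one payment per crossing reset
```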


The 50-50 Contingent Cap 
A variation of the contingent cap is a "50-50" contingent cap. There, the single 
level is split into two levels $K_{lower}$, $K_{upper}$. The upper trigger $K_{upper}$ is set above the 
strike ($K_{upper} > E$) and the lower trigger $K_{lower}$ below the BE rate ($K_{lower} < R_{BE}$). One 
payment is due the first time $L > K_{lower}$ if that happens, and the second equal payment 
is due the first time $L > K_{upper}$ if that happens. No other payments are due. The first 
payment, if it occurs, is less than the original cap price $\$C_{cap}$ would have been. 
The cap insurance must start to pay something before the second payment, if it 
occurs, becomes due. 


Scenario Analyses and Total Return Comparisons for Clients 


What else drives the sales of contingent caps? A desk dealing with interest-rate 
exotics at the broker-dealer BD will be speaking to a counterparty X , a financial 
expert in corporation ABC. If the yield-curve has positive slope, X may not 
believe that rates will increase as much as the theoretical statement made by the 
curve⁷. Still, X wants insurance protection against rate increases. However, X 
does not want to pay anything now. Also X does not want to have to go before 
the ABC management to tell them after the fact that he bought an expensive cap 
that didn't pay off. With the contingent cap, ABC may wind up paying nothing. 


6 Simpler Independent Set of Caplets “Contingent Cap”: In this simpler product, for a 
rising forward rate curve, the forward rates that are below $R_{BE}$ will have large contingent 
premium factors $y \gg 1$ because the probability of escape is high. Clients tend to be 
reluctant to gamble on large factors, whatever the theoretically small probabilities that they 
might have to pay. In the contingent cap described in the text, there is no large payment 
factor because only the correlated probabilities of escape are used. 


7 Betting Against the Forward Curve: Customer scenarios, which are sometimes 
different from the theoretical forward curve, drive many derivative transactions. 

Even if ABC has to pay more later, there will be the satisfaction that the cap will 
start to pay off before ABC has to pay. 

Scenario analysis can be performed to show under which conditions the 
contingent cap or its variations (such as the 50-50 contingent cap) will 
outperform or underperform a standard cap. The scenarios can be either 
theoretically based or historical. The historical scenarios involve running through 
time using data. 

In general, the contingent cap outperforms the standard cap if rates do not rise 
above some level $R_{BigRateRise}$. Otherwise, the standard cap outperforms the 
contingent cap. The decision of ABC to purchase a contingent cap will rely on 
its judgment of future rate rises relative to this level. Scenario analyses are useful 
to help the client decide⁸. 


Contingent Option as a Correlated Set of Digital Options 


We now consider the contingent payment option in more detail. This option as 
we have described it is a complex correlated set of digital options, with lumpy 
cash payments made contingent on the underlying rate. The contingent cap 
needs to be priced using a Monte Carlo (MC) simulator. Approximate analytic forms 
can be derived which can be used to obtain a reasonably good idea of the final result. 

The risk management of the contingent cap is complicated because of the 
complex correlated digital nature of the payouts. To get an idea, we first consider 
the formalism for one digital option. 


Digital Options: Pricing and Hedging 

To give some background, we first spend a little time discussing single digital 
options. Consider a single digital call option⁹ with unit payoff at time $t^*$ if 
$x^* > E$, namely $C(x^*) = \theta(x^* - E)$. The payoff can be regarded as the limit 
of a spread of two ordinary call options, one long and one short. 
The idea is shown in the next picture: 


8 Total Return Scenario Analyses: A fair amount of time on the desk can be taken up 
running such scenarios. Sometimes clients specify scenarios that they would like to see. 


9 Digitals Here, There, Everywhere: The discussion in this section for single digital 
options holds for digitals of any sort: interest rate, FX, equity, commodity, etc. The 
discussion also holds for barrier options that in general can have digital components. In 
this section, we label x as the variable, not its logarithm. 

[Figure: Digital Step Option Approximation. The digital option has strike $E$, and is approximated as a spread of two call options: long at strike $E - \varepsilon$, short at strike $E + \varepsilon$.]


This is the way the digital step option would be viewed for hedging purposes. 
We would short options at strike $E - \varepsilon$ and buy options at strike $E + \varepsilon$. The 
value of $\varepsilon$ has to be chosen such that available hedges exist (e.g. 25 bp or 50 bp). 
The idea is rather like electrostatics where we view a dipole with dimension $2\varepsilon$. 

Here is a picture with $\Delta$ and $\gamma$ with $E = 0.2$, with normal-approximation 
mean and width $\mu = 0.2$, $\sigma = 0.08$: 


[Figure: Delta = Dirac(x-E) and Gamma = (d/dx)[Dirac(x-E)] for the digital option, plotted in the normal approximation.]

[Equation (15.1): illegible in the source.]

Mathematically for zero width we have 

$$\Delta = \delta(x^* - E), \qquad \gamma = \delta'(x^* - E) \qquad (15.2)$$


Therefore, $\Delta$ for a digital call looks like $\gamma$ for a standard call. Also $\gamma$ for a 
digital call is the wildly-varying derivative of the Dirac delta function, or the 
smoothed out version of it from the spread option approximation. 
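
The call-spread picture can be checked numerically. The sketch below is not the model behind the figure; it assumes a normal distribution for the underlying at expiry with absolute width `sigma`, no discounting, and a spread scaled by $1/(2\varepsilon)$ so that the payoff tends to one unit. All function names are illustrative. Under these assumptions the digital value tends to a normal cumulative, its delta is a Gaussian bump (a smoothed Dirac delta), and its gamma is the derivative of that bump.

```python
import math


def norm_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def call_value(x: float, strike: float, sigma: float) -> float:
    """Undiscounted call value when the terminal variable is normal(x, sigma^2)."""
    d = (x - strike) / sigma
    return (x - strike) * norm_cdf(d) + sigma * norm_pdf(d)


def digital_from_spread(x: float, E: float, sigma: float, eps: float) -> float:
    """Long calls at E - eps, short calls at E + eps, 1/(2 eps) units of each."""
    return (call_value(x, E - eps, sigma) - call_value(x, E + eps, sigma)) / (2.0 * eps)


def digital_limit(x: float, E: float, sigma: float) -> float:
    """Zero-width limit of the spread: probability of finishing above E."""
    return norm_cdf((x - E) / sigma)


def digital_delta(x: float, E: float, sigma: float) -> float:
    """Gaussian bump: the smoothed version of Dirac(x - E)."""
    return norm_pdf((x - E) / sigma) / sigma


def digital_gamma(x: float, E: float, sigma: float) -> float:
    """Derivative of the bump: positive below E, negative above E."""
    d = (x - E) / sigma
    return -d * norm_pdf(d) / sigma**2


E, sigma, eps = 0.20, 0.08, 0.0025  # strike and width as in the figure; eps = 25 bp
for x in (0.15, 0.20, 0.25):
    print(x, digital_from_spread(x, E, sigma, eps), digital_limit(x, E, sigma),
          digital_delta(x, E, sigma), digital_gamma(x, E, sigma))
```

The sign change of the gamma across the strike is what makes the hedge of a digital unstable near $E$.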


Whipsaw Theoretical Scenario Analysis for Digitals 

A time-dependent scenario to exhibit and drive home the dangerous nature of 
digital options is called “whipsaw” analysis. Here the underlying variable x(t) 
is assumed to be alternately above and then below the strike of the digital option 
E as time passes. In this way, we alternately are closer to long and then short 
options, so the hedging oscillates dramatically. In the electrostatic analogy, we 
see alternately a net positive charge and a net negative charge from the dipole. 
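
A whipsaw scenario of this kind can be sketched with the smoothed digital greeks. The example assumes a normal model with absolute volatility `sigma_abs` per square-root year and a hypothetical path that sits alternately 1% above and below the strike as expiry approaches; none of these numbers come from the text.

```python
import math


def norm_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def digital_greeks(x: float, strike: float, sigma_abs: float, tau: float):
    """Delta and gamma of a unit digital call under an assumed normal model."""
    s = sigma_abs * math.sqrt(tau)  # width of the distribution at expiry
    d = (x - strike) / s
    delta = norm_pdf(d) / s
    gamma = -d * norm_pdf(d) / s**2
    return delta, gamma


strike, sigma_abs, wiggle = 0.20, 0.08, 0.01
taus = [1.0, 0.75, 0.5, 0.25]                 # time to expiry at each observation
path = [strike + wiggle * (-1) ** k for k in range(len(taus))]  # above, below, above, below

gammas = []
for x, tau in zip(path, taus):
    delta, gamma = digital_greeks(x, strike, sigma_abs, tau)
    gammas.append(gamma)
    print(f"tau={tau:4.2f}  x={x:.2f}  delta={delta:7.3f}  gamma={gamma:9.3f}")
```

In this toy path the gamma flips sign at every step and grows in size as expiry nears, which is the oscillating hedge described above.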


Hedging A Correlated Set of Digital Options 


A correlated set of digital options such as seen in the contingent cap needs to be 
treated numerically. Still, the forward curve relative to the trigger level can be 
used to get a physical picture of the hedging properties. 

The idea is to look at the time when the forward curve intersects the trigger 
level; this gives an average idea of when we can expect reasonable probability of 
the contingent option going into the money. 


Historical Simulations and Hedging 


An historical simulation can be run to test the efficacy of hedging strategies. We 
briefly outline the formalism. Take a portfolio $V(x,t)$ of the security $C(x,t)$ 
and hedge $H^{(k)}(x_k, t_k)$ at a number of successive points in the past $\{t_k\}$, 
with historical data used for $x_k$. The hedge has a superscript because of a hedge 
rebalancing strategy (for example delta hedging), so the hedge function itself 
changes. At $(x_{k+1}, t_{k+1})$ the new hedge $H^{(k+1)}$ is the old hedge $H^{(k)}$ plus the 
change $\delta H^{(k,strat)}$, viz 

$$H^{(k+1)}(x_{k+1}, t_{k+1}) = H^{(k)}(x_{k+1}, t_{k+1}) + \delta H^{(k,strat)}(x_{k+1}, t_{k+1}) \qquad (15.3)$$

Revalue the portfolio $V_k = C(x_k, t_k) - H^{(k)}(x_k, t_k)$ successively moving 
forward in time using the time difference $d_t x_k = x_{k+1} - x_k$. Write the portfolio 
“reval” change between times $t_k$ and $t_{k+1}$ using the model as 

$$\delta V_k^{(reval)} = V_{k+1} - V_k \qquad (15.4)$$

Using the sensitivities from the model for the portfolio, calculate the 
“Math-Calc” change as 

$$\delta V_k^{(MathCalc)} = \Delta_k \, d_t x_k + \tfrac{1}{2} \gamma_k \, (d_t x_k)^2 + \text{vega}_k \cdot d_t \sigma_k + \theta_k \, dt \qquad (15.5)$$

Note that we do not replace $(d_t x_k)^2$ by $dt$. This replacement is only 
theoretically possible for the time-averaged or ensemble-averaged quantity, 
provided Gaussian statistics hold. Instead, we just record the results from the data 
for each time step. 

The unexplained P/L or slippage is then the difference between the reval 
difference and the Math-Calc difference, 

$$\delta V_k^{(Unexplained\ P/L)} = \delta V_k^{(reval)} - \delta V_k^{(MathCalc)} \qquad (15.6)$$

The statistics over time can then be examined for the portfolio values $\{V_k\}$, the 
reval changes $\{\delta V_k^{(reval)}\}$, and the unexplained P/L, $\{\delta V_k^{(Unexplained\ P/L)}\}$. An 
informed judgment can then be made on the relative risk of the hedging strategy. 
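
The bookkeeping in Eqs. (15.4)-(15.6) can be sketched as follows. This is a minimal illustration, assuming the portfolio values and model sensitivities have already been computed at each historical date; the function and argument names are hypothetical, and the toy data (a portfolio equal to $x^2/2$) are chosen only because the second-order expansion is then exact.

```python
from typing import List, Sequence


def unexplained_pnl(values: Sequence[float], deltas: Sequence[float],
                    gammas: Sequence[float], vegas: Sequence[float],
                    thetas: Sequence[float], xs: Sequence[float],
                    sigmas: Sequence[float], dt: float) -> List[float]:
    """Reval change minus Math-Calc change for each historical time step k."""
    slippage = []
    for k in range(len(values) - 1):
        dx = xs[k + 1] - xs[k]              # realized move from the data, not replaced by dt
        dsigma = sigmas[k + 1] - sigmas[k]
        reval = values[k + 1] - values[k]   # Eq. (15.4)
        math_calc = (deltas[k] * dx + 0.5 * gammas[k] * dx**2
                     + vegas[k] * dsigma + thetas[k] * dt)  # Eq. (15.5)
        slippage.append(reval - math_calc)  # Eq. (15.6)
    return slippage


# Toy check: V(x) = x^2 / 2 has delta = x and gamma = 1, so nothing is unexplained.
xs = [1.0, 1.5, 0.5]
values = [0.5 * x * x for x in xs]
result = unexplained_pnl(values, deltas=xs, gammas=[1.0] * 3, vegas=[0.0] * 3,
                         thetas=[0.0] * 3, xs=xs, sigmas=[0.2] * 3, dt=1.0 / 252)
print(result)
```

With real data the output series would be examined statistically, as described above.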


The Process of Actually Running the Historical Simulation 


Having said all this, it is a major undertaking to run a historical simulation¹⁰. 
Data must be collected and a front-end link from the model to the data 


10 New Products and Simulations: If management/traders want to roll out a new 
product, they may consider a lot of risk simulations “nice to have but not really needed". 
A conservative approach relies on positive simulation results before approval. In the limit 

established. Then the simulator must be run, results collected and analyzed, etc. 
The whole process easily may take weeks, depending on the set-up work 
required. 


Yield-Curve Shape and Principal-Component Options 


One can write options on the yield-curve shape. Yield-curve options on the 
difference of two rates are standard. Options on butterflies have also been 
written. We now discuss a more sophisticated example of yield curve options 
based on principal components. 


Principal-Component Options 


Principal-component analysis leads to an equivalent description of yield curve 
movements, with the advantage that generally (although not always), a few 
principal components suffice to describe most of the yield curve shape changes. 
In Ch. 48, App. B we discuss the details of principal components or EOF analysis 
of the yield curve to which we refer the reader. Options can be written on 
principal components¹¹. 

For example, we can look at three rates on the yield curve (maturity 2, 5, 10 
years). The third-order "flex" component $d_t Y^{(\beta=Flex)}$ of the yield-curve 
movements $(d_t r)_T$ for maturity $T$, averaged over time, as observed in 1992 was 

$$d_t Y^{(\beta=Flex)} = \left[\, 0.16\,(d_t r)_{T=2} - 0.81\,(d_t r)_{T=5} + 0.57\,(d_t r)_{T=10} \,\right] \qquad (15.7)$$

We use this as an example. In order to write an option on the flex component, 
consider the explicit time dependence of the flex component. For generality, we 
can put in arbitrary weights and maturities: 

$$Y^{(\beta=Flex)}(t) = w_{Short\,T}\, r_{Short\,T}(t) - w_{Med\,T}\, r_{Med\,T}(t) + w_{Long\,T}\, r_{Long\,T}(t) \qquad (15.8)$$


however, hostile traders may use unending simulation requirements as a negative tactic to 
kill the idea. I was once on the receiving end of this tactic. 


11 History: The idea of writing principal-component yield-curve options occurred to me 
in 1992, or more generally on any linear combination of rates. Averaging was included as 
a feature. These options were called STAR options, short for “Swap Twist Average Rate” 
options. The combination was supposed to be a combination of three swap rates, time 
averaged. The complex yield-curve option idea was followed up and marketed in 1993. 

The sum of the squares of the weights is normalized to one, $\sum_T w_T^2 = 1$. There are 


various possible option payouts. For example, the payout for a flex call option at 
exercise date $t^*$ would read (with $W$ a normalization factor), 

$$C^{(Flex)}_{call\_payout}(t^*) = W \left[\, Y^{(\beta=Flex)}(t^*) - E \,\right]_+ \qquad (15.9)$$


To price this option, we can assume that the differential time change of the 
flex principal component $d_t Y^{(\beta=Flex)}(t)$ satisfies Gaussian or mean-reverting 
Gaussian statistics. This assumption is probably better than lognormal, because 
the flex can be either positive or negative. Standard option theory can then be 
used to construct the option value. The volatility would be an implied volatility 
taken from the market. The historical volatility for comparison would be 
constructed from the volatilities of the rates along with the correlations, as usual 
for a basket quantity like the flex principal component. 
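
A minimal sketch of this pricing logic follows, under assumptions that are not in the text: a driftless Gaussian flex component (no mean reversion), absolute rate volatilities in yield units, a single discount factor, and $W = 1$. The function names are illustrative. The weights are the signed 1992 flex weights quoted above.

```python
import math


def norm_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def basket_normal_vol(weights, vols, corr) -> float:
    """Absolute volatility of sum_i w_i r_i from rate vols and correlations."""
    n = len(weights)
    var = sum(weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
              for i in range(n) for j in range(n))
    return math.sqrt(var)


def gaussian_call(forward: float, strike: float, vol: float, expiry: float,
                  discount: float = 1.0) -> float:
    """Call on a normally distributed variable (Bachelier-type formula)."""
    s = vol * math.sqrt(expiry)
    if s == 0.0:
        return discount * max(forward - strike, 0.0)
    d = (forward - strike) / s
    return discount * ((forward - strike) * norm_cdf(d) + s * norm_pdf(d))


flex_weights = [0.16, -0.81, 0.57]       # 2-, 5-, 10-year weights with signs
rate_vols = [0.012, 0.011, 0.010]        # hypothetical absolute vols per sqrt(year)
corr = [[1.00, 0.90, 0.80],
        [0.90, 1.00, 0.95],
        [0.80, 0.95, 1.00]]              # hypothetical correlation matrix

flex_vol = basket_normal_vol(flex_weights, rate_vols, corr)
print(flex_vol, gaussian_call(forward=0.0, strike=0.0, vol=flex_vol, expiry=1.0))
```

Because the flex weights nearly sum to zero, highly correlated rates give a small flex volatility; the option value is therefore sensitive to the correlation inputs.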

The approach described above that I developed was eventually published¹². 


Example of Potential Clients for Principal-Component Options 


The appeal of the flex option can be for clients having portfolios with exposure to 
flexing of the yield curve. For example, insurance companies have SPDA 
products funded at the medium-term part of the curve (5-7 years), so they lose 
money if these rates go up. Their assets are however tied to other parts of the 
curve. For example, they may have longer-term mortgage assets and short-term 
floating-rate assets. In this case, there is some flex risk, which could be 
ameliorated by such options. 


Principal-Component Risk Measures (Tilt Delta etc.) 


We have been describing principal-component yield-curve options designed to 
hedge against various movements of the yield curve. In order to systematize the 
discussion of yield-curve movement risk, a class of risk measures can be 
constructed with principal components¹³. For example, we can define first-order 
quantities “tilt delta” or “flex delta” arising from constructing a tilt or flex 
movement of the yield curve and observing the change in the portfolio value 


12 Acknowledgement: I thank Vineer Bhansali for discussions. 


13 History: The idea of principal-component risk measures in my notes dates from 1989. I 
proposed it at various times. 

under such a movement. Similarly, second-order gamma quantities can be 
constructed. For example, a “tilt gamma” can be defined as the difference of a 
“steepening tilt delta" and a “flattening tilt delta". 

Because the principal components of the yield curve change as a function of 
time, it can be preferable to use stylized fixed definitions in order to separate out 
the risk of the portfolio cleanly. For example, the steepening tilt delta can be 
defined by tilting the yield curve counter-clockwise in a linearly-interpolated 
fashion by a certain amount, while holding one point on the curve (e.g. the 5-year 
point) fixed. These principal-component risk measures are specific scenario risks. 

Various rate types can be used to define these risk measures. For example, we 
can use the forward-rate curve, the zero-coupon curve, etc. 
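
The stylized tilt measures can be sketched numerically. The example below assumes a zero-coupon curve given at the cash-flow maturities, continuous compounding, a linear tilt about the 5-year point, and finite-difference definitions in which both tilt deltas are quoted per unit of steepening; these conventions and all names are assumptions of the sketch, not definitions from the text.

```python
import math


def tilt_curve(zero_rates: dict, slope: float, pivot: float = 5.0) -> dict:
    """Tilt {maturity: rate} linearly about the pivot; slope > 0 steepens the curve."""
    return {T: r + slope * (T - pivot) for T, r in zero_rates.items()}


def portfolio_value(cashflows: dict, zero_rates: dict) -> float:
    """Present value of {maturity: amount} using continuous compounding."""
    return sum(amount * math.exp(-zero_rates[T] * T) for T, amount in cashflows.items())


def tilt_risk(cashflows: dict, zero_rates: dict, slope_bump: float = 1e-4,
              pivot: float = 5.0):
    """Steepening tilt delta, flattening tilt delta, and tilt gamma by finite differences."""
    v0 = portfolio_value(cashflows, zero_rates)
    v_steep = portfolio_value(cashflows, tilt_curve(zero_rates, +slope_bump, pivot))
    v_flat = portfolio_value(cashflows, tilt_curve(zero_rates, -slope_bump, pivot))
    steep_delta = (v_steep - v0) / slope_bump
    flat_delta = (v0 - v_flat) / slope_bump
    tilt_gamma = (steep_delta - flat_delta) / slope_bump
    return steep_delta, flat_delta, tilt_gamma


curve = {2.0: 0.040, 5.0: 0.045, 10.0: 0.050}   # hypothetical zero rates
portfolio = {2.0: -50.0, 10.0: 100.0}           # hypothetical cash flows
print(tilt_risk(portfolio, curve))
```

A cash flow at the pivot maturity has no tilt risk under this definition, which is the sense in which the measure isolates one type of curve movement.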


Hybrid 2-Dimensional Barrier Options—Examples 


The two-dimensional barrier options are single barrier options that are dependent 
on two different variables. They are part of a class of options also called “hybrid 
options”¹⁴. Hybrid options can be formulated in a variety of markets. Typically, 
what happens is that an option depending on one variable is knocked in or 
knocked out at a barrier depending on a second variable. The two variables are 
correlated together. In this section, we give an example. The mathematical 
apparatus¹⁶ is given in Ch. 19. 
The idea is shown in the picture below: 


14 History: This work was performed in 1993 while sitting on an exotic interest-rate desk. 
This included producing indicative prices and marketing these 2D hybrid options for a 
number of different pairs of underlying variables, including interest rates, equities, FX, 
and commodities for many quotes. Hybrids were then a new class of options. It was a 
fast-paced experience. 


15 Correlation Risk: One main risk of 2D hybrid options is correlation risk. It is 
necessary to price in some reserve corresponding to stressed values of the correlation that 
could cause difficulty with hedging. 


16 Math Preview: Under simple enough conditions including a constant barrier level, 
hybrid 2D options can be solved in closed form involving bivariate integrals. Monte-Carlo 
simulation is used if complicated side conditions are included. See Ch. 19. 

[Figure: Hybrid 2-dimensional barrier options. Knock-out (KO) or knock-in (KI) for the option on the 1st variable; the 2nd variable has a barrier at H, shown with the path of the 2nd variable.]


The Cross-Market Knock-Out (XMKO) Cap 

The cross-market knock-out cap, or XMKO cap, is a hybrid product that allows 
ABC to buy interest rate insurance at a cheaper value. The idea is to make the cap 
payout contingent on some event occurring in another market. Because ABC will 
not receive the cap payout under all conditions, the XMKO cap has a lower value 
than a standard cap. The idea is shown below: 


[Figure: Cross-Market Knock-Out Cap. Two parties, ABC and the Broker-Dealer: ABC buys a Libor cap and ABC sells the knock-out.]

Roughly speaking the XMKO cap can be thought of as ABC buying a standard 
cap and selling the knockout, thus lowering the cash ABC needs to pay up front. 
ABC for example might have assets that increase in value if (say) the price of oil 
increases. If the price of oil goes up enough, ABC can cover the increased cost 
associated with increasing interest rates and therefore does not need the cap 
insurance. Therefore, ABC may want to buy a XMKO cap that knocks out if the 
price of oil goes above a certain level. 

Clearly, the value of the XMKO cap is highly dependent on the correlation 
between the changes of oil prices (in this example) and the changes of Libor. If 
the correlation is higher, the knock out will be more probable and the XMKO cap 
value will be lower. Because correlations are unstable, some measure of 
correlation uncertainty will be included in the price ABC pays. 

The XMKO cap is evaluated using the caplets in the cap. A separate XMKO 
option for each caplet is independently determined and the results added. Thus, 
independently, the payment for a given Libor caplet will knock out if the second 
variable crosses the barrier, but otherwise the Libor caplet will pay off if Libor 
goes above the strike rate of the cap. 


Marketing XMKO Caps 


One problem associated with the XMKO cap is that the asset and liability sides of 
corporations sometimes do not talk much to each other. Because this product is 
dependent on both the asset side (determining the knock out) and the liability side 
(involving debt issuance insurance), more people have to be involved. 


Cross Market Knock-In Put Option and Correlation Risk 
Here is an example of a cross-market knock-in option¹⁷. The example is from the 
commodities market¹⁸, a 2-yr UI Al put 20% OTM with a barrier on NG up 20%. 
Naturally, the up-in put is cheaper than the standard put because of the restrictive 
condition on its existence. The details would be tailored to the customer whose 
business depends on both the variables. 

We need to discuss correlation risk. The hybrid nature of the knock-in (KI) is 
strongly dependent on the correlation $\rho_{Al,NG}$ between Al and NG movements. 
The value of $\rho_{Al,NG}$, depending on the window, was between −0.3 and +0.4. 


17 Remark: This was indicative pricing for a deal that didn't happen. You get used to it. 


18 Translation: The jargon means that the strike of the aluminum Al put option was 20% 
out-of-the-money OTM below its spot value. The put knocked in (up-in or UI) if natural 
gas NG went above the barrier 20% higher than its spot value. The date for the put 
exercise (if it got knocked in) was 2 years from the trade date. The results given 
correspond to the market in 1993. 

This large variation in correlations happens in other variables, as we examine in 
other parts of the book (cf. Ch. 37). 
If the put is to exist, NG has to increase enough so that the option knocks in. 
If $\rho_{Al,NG} > 0$, then Al will tend to increase also, making the KI put less valuable 
than if $\rho_{Al,NG} < 0$ where Al tends to decrease. 


Results for the Illustrative XMKI Put Option 
The results for the Cross-Market knock-in put price (expressed as a ratio to the 
standard put) show that the correlation dependence is strong. 

Here is a table of the results: 


Ratio of 2D knock-in put to standard put 
as a function of Al, NG correlation 


Correlation Ratio 
-30% 80% 
0% 66% 
40% 46% 
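
The direction of this correlation dependence can be reproduced with a small Monte Carlo sketch. It is not the model used for the table and is not expected to match the quoted ratios: it assumes driftless lognormal dynamics for both commodities, hypothetical volatilities, monthly barrier monitoring, a put strike 20% below spot, and no discounting (which cancels in the ratio). Function and parameter names are illustrative.

```python
import math
import random


def knock_in_put_ratios(rhos, vol_al=0.20, vol_ng=0.40, years=2.0,
                        steps=24, n_paths=20000, seed=1):
    """Ratio of an up-and-in Al put (barrier on NG) to a standard Al put, per correlation."""
    rng = random.Random(seed)
    dt = years / steps
    strike, log_barrier = 0.8, math.log(1.2)   # spots normalised to 1
    std_sum = [0.0] * len(rhos)
    ki_sum = [0.0] * len(rhos)
    for _ in range(n_paths):
        z_ng = [rng.gauss(0.0, 1.0) for _ in range(steps)]
        z_ind = [rng.gauss(0.0, 1.0) for _ in range(steps)]
        # NG path: check the barrier at each monitoring date.
        log_ng, knocked_in = 0.0, False
        for z in z_ng:
            log_ng += -0.5 * vol_ng**2 * dt + vol_ng * math.sqrt(dt) * z
            if log_ng >= log_barrier:
                knocked_in = True
                break
        s_ng, s_ind = sum(z_ng), sum(z_ind)
        # Same random numbers reused for every correlation to reduce noise.
        for i, rho in enumerate(rhos):
            shock = rho * s_ng + math.sqrt(1.0 - rho * rho) * s_ind
            al_T = math.exp(-0.5 * vol_al**2 * years + vol_al * math.sqrt(dt) * shock)
            payoff = max(strike - al_T, 0.0)
            std_sum[i] += payoff
            if knocked_in:
                ki_sum[i] += payoff
    return [k / s for k, s in zip(ki_sum, std_sum)]


print(knock_in_put_ratios([-0.3, 0.0, 0.4]))
```

The ratio falls as the correlation rises, in line with the argument above that a positive correlation pushes Al up when NG knocks the option in.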


Standard Commodity Options 


We have been looking at exotic commodity options. The last example had 
aluminum and natural gas. The hedging of these commodity exotics could 
involve standard commodity options. Forward commodity prices are obtained 
from exchanges (e.g. the LME for metals), and traditional option models are used 
for these standard options. 


Reload Options 


In this section, we consider qualitatively some aspects of certain corporate stock 
options called “reloads”, which sometimes have been given to executives and 
other employees¹⁹. The reload feature is part of the initial option specification, 


19 History and Story: I was the in-house corporate quantitative resource for a fascinating 
episode involving reload options that once played a role in an important proxy vote. The 
issue was the reload option value as obtained by a buy-side consulting service that 
advised institutional clients on how to vote their shares. The service made an error using a 
model written by another consultant. Once the mistake was ferreted out, the information 
had a positive impact on the vote. The lesson is that model risk can be significant at the 
corporate level. For other aspects of model risk in pricing executive options, see ref. iii. 

which gives back to the reload options holder a number of new reload options, 
equivalent to the number of shares needed for the costs of exercise and taxes. The 
reload options that result from exercise are generally granted at-the-money at 
exercise²⁰. Typically, contractual or other restrictions apply²¹. 

The idea is shown in the following picture: 


[Figure: Reload Option Logic. Timeline: option granted; exercise + reload; option expires.]


In valuing reload options, a successive approximation technique can be used. 
For simplicity, we begin with the approximation that there are no reloads or 
restrictions, and that early exercise is allowed as a standard American option²². 
Then we follow a perturbative approach by adding to the standard American 
option value an extra differential amount corresponding to the extra value for the 


20 Reload Strikes: Because the reload options are granted at the money, these options 
have only time value at the time they are granted. 


21 Restrictions of Exercise and Sale: These can include contractual restrictions on the 
time period for vesting of the various options after they are obtained before exercise is 
allowed, a minimum value of the stock increase before exercise is allowed, restrictions on 
the time before stock obtained from exercise can be sold, etc. Sometimes executives 
make commitments not to exercise options even when allowed. See Ref. iv. 


22 American Options basically give a Lower Bound for Reload Options: American 
options allow for early exercise before the expiration date. The reload option (in the 
absence of restrictions) is at least as valuable as a standard American option. The reload 
option holder can exercise at any time, as with an American option, plus extra value 
exists from the reload provisions. Restrictions lower the value to some extent. 

possibility of reloading²³. This extra amount is calculated using a Monte Carlo 
simulator, following the contractual logic of the reload options²⁴. Contractual 
restrictions subtract from the value. We generate a number of stock paths into the 
future. On each path, the characteristics of the reload contract are put in just as 
they would be if the stock price were realized in the real world. We then discount 
the extra value of reloads on each path, and average the results over all paths²⁵. 

Although dependent on the details, the reload option without stock sale 
restrictions may be worth around 20% – 30% more than a standard American 
option. For a hypothetical example, if the total term of the reload option is 10 
years, and if a 10-year American ATM option is worth $30, the total value of the 
reload option without stock sale restrictions could be $36 to $39. 


Discounts due to Restrictions on Sale of Stock 

The stock obtained by exercising company stock options may be restricted for 
sale by company regulations for a certain amount of time. The idea behind the 
modeling of the restriction discount value is shown in the following picture. 


23 Reload Option FAS 123 Reporting: Reload options have been valued for reporting 
purposes as short-term European options using a Black-Scholes model. The options may 
be valued at-the-money, using the stock price $S_{grant}$ at the time that the options were 
granted or obtained. Reloads that arise from previous option exercises are included. This 
procedure satisfies FASB reporting requirements (ref. iv). The model in the text gives 
considerably larger values. Say at time $t_i$ that a given reload option is approximated as a 
European option up to an effective exercise time t*. Then the value of future unrealized 
reloads for t > t* is ignored. This European approximation was not used in the story 
above. Marking the options to market with the current stock price $S(t_0)$ is a separate issue. 


24 Technical Point Regarding Optimality of Exercise Including Reloads: The 
approach requires some assumption regarding the timing of early exercise with reloads. 
Typically, it is assumed that exercise with reloads is performed as soon as allowed. It is 
plausible, but as far as I know unproved, that this is an optimal strategy. The strategy 
allows the option holder to *cash out" on the profit as soon as possible. 


25 Effective Paths for Monte-Carlo Simulation: An equivalent procedure is to generate 
effective paths with weights corresponding to the density of paths given by the 
probability distribution function generating the paths. These weights are specific from 
one local bin region in x(t) at time t to another local bin region in x(t + dt) at time t + dt. 
See Ch. 44 on numerical methods. 

[Figure: Model for Stock Restriction Discount. Between the time the option is exercised and the end of the restriction period for sale of stock: sell upside gain (call option), buy downside protection (put option).]


An option including this restriction is worth less because the option holder 
gives back some value, since he/she is not able to sell the stock during the 
restriction period. We next describe the model. 


Option Model for Stock Restriction Discount 


A model for this restriction discount is based on option theory. The idea is to 
estimate the hedging cost during the restriction period, such that the risk of stock 
price changes is eliminated, using options. To do this, recall that the no-arbitrage 
option theory tells us that the forward stock price is the market-expected stock 
price. We want to consider the forward stock price at the end of the restriction 
period. 

If at the time the option is exercised and the restricted stock obtained, we 
were to sell a call option and buy a put option, both struck at the forward stock 
price, we would hedge out the stock price risk. This is because we buy protection 
from the put option if the stock price drops. However, we give up the gain from 
the short call option if the stock price increases. If the volatilities are the same, 
these two options have theoretically equal value. However, we must sell the call 
at the bid volatility and buy the put at the ask volatility. The resulting hedging 
cost gives the amount of the restriction discount. 

Sometimes only a put option is used for the estimate of the restriction value. 
Downside protection by itself (i.e. just the put option) is very costly. However, 
including only downside protection ignores the upside gains. 

As a rule of thumb, for a 2-year restriction, assuming 2% volatility bid-asked 
spread, the discount is around 10% of the stock value. The discounts increase 
with the amount of volatility spread and with the length of the restriction period. 

16. A Pot Pourri of Deals (Tech. Index 5/10) 


Various securities, options, and risks are discussed in this chapter¹. Some are 
exotic and some are not. We consider: 


TIPS, and an Important Statement for the Macro-Micro Model 

Muni Derivatives and Forward Muni Issuance with Derivatives Hedging 
Difference Options (Equity Index vs. Basket) 

Resettable Options: Cliquets 

Power Options 

Periodic Caps 

ARM Caps 

Index-Amortizing Swaps 

A Hypothetical Repo + Options Deal 

Convertible Issuer Risk 


TIPS (Treasury Inflation Protected Securities) 


Description of TIPS 


TIPS, or Treasury Inflation Protected Securities, are Consumer Price Index (CPI) 
- indexed government securities". TIPS were introduced in 1997. Both the 
principal and the coupon increase to offset inflation. The main theoretical 
intuition is that a “real” yield $r_{real}$ plus an “expected inflation” component $g$ is 
supposed to be the observed nominal treasury market yield for the same maturity, 
$y^{Market} = r_{real} + g$. The TIPS yield is supposed to be $r_{real}$, insulated from 
inflation. However the components $r_{real}$, $g$ are not directly observable. The CPI 


1 History: This work was mostly performed during 1988-97 (the last section in 2002). By 
the way, pot-pourri does not mean “rotten pot”. According to the French Larousse, 
pot-pourri is a “production littéraire formée de divers morceaux”, if that helps. 

year-over-year change is taken to give the inflation at a given time², 
namely $g(t) \cdot 1\,\mathrm{yr} = [\,CPI(t) - CPI(t - 1\,\mathrm{yr})\,] / CPI(t - 1\,\mathrm{yr})$. 


There is now a TIPS index". 


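The TIPS bond price is 

$$\$B_{TIPS} = \$P \cdot \left[ \sum_{i=1}^{N} \eta^{i}\,(c/2) + \eta^{N} \right]$$

Here, $c$ is the initial coupon, $N/2$ years is the bond maturity, $\eta = (1 + g\Delta t)/(1 + y\Delta t)$, 
$\Delta t = \frac{1}{2}$ yr, the initial notional is $\$P$, and accrual interest is ignored. Hence 

$$\$B_{TIPS} = \$P \cdot \left\{ (c/2) \left[ \frac{\eta - \eta^{N+1}}{1 - \eta} \right] + \eta^{N} \right\} \qquad (16.1)$$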
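The TIPS delta price risk is 

$$\$\Delta B_{TIPS} = \delta g \left.\frac{\partial \$B_{TIPS}}{\partial g}\right|_{y} + \delta y \left.\frac{\partial \$B_{TIPS}}{\partial y}\right|_{g}$$

which depends on inflation and yield changes $\delta g$ and $\delta y$, along with the 
correlation³ $\rho = \rho(\delta g/g, \delta y/y)$. Increased inflation $\delta g > 0$ increases the 
TIPS value, while increased yields $\delta y > 0$ decrease the TIPS value. 
Now if a yield change $\delta y$ occurs, we expect on the average a correlated 
inflation change as $\delta g = \beta\,\delta y$ where $\beta = \rho\, g \sigma_g / (y \sigma_y)$, and where $(\sigma_g, \sigma_y)$ are the 
lognormal (inflation, yield) volatilities⁴. 

Noting $\partial\eta/\partial g = -(1/\eta)\,\partial\eta/\partial y$, the TIPS delta price risk is approximately⁵ 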


2 Inflation term structure: At real time t, there can be an inflation term through a term 
structure $g(t; \tau)$ for maturity $\tau$. 


3 Inflation vs. Treasury Yield Correlation: When TIPS were first issued in 1997 this 
correlation $\rho$ was surprisingly low, under 20%. The simple theoretical idea is that the 
correlation should be high if inflation changes produce most of the market yield changes. 


4 Lognormal Vols and Underlying Changes: Recall that a lognormal volatility is 
defined as $\sigma_y = \sqrt{\langle (dy/y)^2 \rangle}$ so a change $dy$ is on the order of $y \cdot \sigma_y$ for some 
representative $y$. 


5 Stochastic risk add-on: In addition to this average deterministic change over time dt, 
there is a noise component if a stochastic model is used for g and y. 

$$\$\Delta B_{TIPS} \approx \left[ 1 - \frac{\rho\, g\, \sigma_g}{\eta\, y\, \sigma_y} \right] \delta y \left.\frac{\partial \$B_{TIPS}}{\partial y}\right|_{g} \qquad (16.2)$$

The decreased risk due to inflation is manifest in the minus sign of the second 
term. Nonetheless, the risk is not zero. This means that the duration of the TIPS is 
not zero either. Hence a TIPS, while having more stable behavior than a treasury 
of the same maturity, is not the same as a floater at par⁶. 
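
The TIPS price formula and the correlated delta risk can be checked with a short sketch. It takes the ratio $\eta = (1 + g\Delta t)/(1 + y\Delta t)$ with $\Delta t = 1/2$ year and the price as the notional times the sum of discounted, inflation-grown coupons plus principal, as in Eq. (16.1). The numerical inputs, the function names, and the finite-difference derivative are assumptions of the example.

```python
def eta(g: float, y: float, dt: float = 0.5) -> float:
    return (1.0 + g * dt) / (1.0 + y * dt)


def tips_price_sum(c: float, n: int, g: float, y: float, notional: float = 100.0) -> float:
    """Price as the explicit sum over the n semiannual periods."""
    h = eta(g, y)
    return notional * (sum(h**i * (c / 2.0) for i in range(1, n + 1)) + h**n)


def tips_price_closed(c: float, n: int, g: float, y: float, notional: float = 100.0) -> float:
    """Closed form from the geometric series."""
    h = eta(g, y)
    if abs(1.0 - h) < 1e-12:          # g equal to y: every term of the sum equals c/2
        return notional * (n * c / 2.0 + 1.0)
    return notional * ((c / 2.0) * (h - h**(n + 1)) / (1.0 - h) + h**n)


def tips_delta(c: float, n: int, g: float, y: float, dy: float, rho: float,
               sigma_g: float, sigma_y: float, notional: float = 100.0,
               bump: float = 1e-6) -> float:
    """Price change for a yield move dy with the average correlated inflation move."""
    dB_dy = (tips_price_closed(c, n, g, y + bump, notional)
             - tips_price_closed(c, n, g, y - bump, notional)) / (2.0 * bump)
    beta = rho * g * sigma_g / (y * sigma_y)   # average inflation move per unit yield move
    return (1.0 - beta / eta(g, y)) * dy * dB_dy


c, n, g, y = 0.035, 20, 0.025, 0.06            # hypothetical 10-year TIPS inputs
print(tips_price_sum(c, n, g, y), tips_price_closed(c, n, g, y))
print(tips_delta(c, n, g, y, dy=1e-4, rho=0.2, sigma_g=0.30, sigma_y=0.15))
```

With a positive correlation the price change is smaller in size than the pure yield effect but still not zero, which is the point made in the text about TIPS duration.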

The TIPS yield while relatively stable is not constant. For example, price data 
show that typical TIPS prices ranged around 94 to 103 over the period 4/15/99 to 
7/31/01 (See Jarrow and Yildirim, Ref). 


TIPS and the CAPM model 


This whole idea is rather similar to that used in the CAPM model in another
context. The above expression for δg is the “parallel” component of the
inflation change along the δy direction. This can be seen by writing ρ = cos θ,
as explained in Ch. 22. An idiosyncratic perpendicular component exists but we
do not model it.


TIPS and an Important Statement for the Macro-Micro Model 


One important point in this section is the analysis of Bertonazzi and Maloney
based on data from 1997-1999 showing that TIPS yields are relatively stable
compared to ordinary yields. They stated: “The implication is that inflation is the
biggest component of variance in the yield on government bonds”.

The Bertonazzi-Maloney statement is in philosophical agreement with the
Macro-Micro Model⁷. The Macro-Micro Model, originating from a study of
government bond data, states that most of the variance in bond yields arises from 
macroeconomic (Macro) causes, which naturally would include inflation. 
Macroeconomic effects other than inflation could still be present as trends in 
TIPS on the time scale of months. 

A smaller, highly mean-reverting and rapidly fluctuating component also
exists, arising from trading (Micro) effects.


6 Trader's Intuition? Before the TIPS were issued, one influential trader forcefully
promoted the idea that the TIPS would remain at par, and therefore have zero duration 
and no risk. The first week of trading sobered things up a bit. However, the TIPS are 
indeed relatively stable. 


7 Macro-Micro Model: This model is treated in detail later in this book, Ch. 47-51. 
Municipal Derivatives, Muni Issuance, Derivative Hedging


Muni Derivatives 
Muni derivatives are treated with formalism similar to that of Libor-based
derivatives. The main assumption is that the Muni forward rates f_Muni^(T) at
different maturities T are fractions ξ^(T) < 1 of Libor forward rates f_Libor^(T), viz
f_Muni^(T) = ξ^(T) · f_Libor^(T). The fractions can be written as ξ^(T) = 1 − τ^(T), with a
"forward tax rate" τ^(T). The idea of course is that municipal cash bond interest is
generally exempt from federal income tax, and possibly exempt from state and
local taxes.

Muni swaps and options are defined similarly to Libor swaps and options.
For example, the floating rates in a Muni swap might be specified as obtained
from the J. J. Kenny municipal short-rate index.

The Muni rate risk can be found by writing the change in the muni forward
rate composed of the two terms δf_Muni^(T) = ξ^(T) · δf_Libor^(T) + δξ^(T) · f_Libor^(T). The term
with δξ^(T) gives the risk due to changes in the Muni/Libor ratio.
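A small numerical sketch may help fix the two-term decomposition of muni rate risk. It assumes only that the muni forward rate is a ratio times the Libor forward rate; the function names and the sample numbers are hypothetical.

```python
def muni_forward(ratio, libor_forward):
    """Muni forward rate as a fraction of the Libor forward rate."""
    return ratio * libor_forward


def muni_rate_change(ratio, libor_forward, d_libor, d_ratio):
    """Split the first-order muni forward-rate change into its two risk terms.

    Returns (libor_term, ratio_term): the part due to a Libor move at fixed
    ratio, and the part due to a change in the Muni/Libor ratio.
    """
    libor_term = ratio * d_libor
    ratio_term = d_ratio * libor_forward
    return libor_term, ratio_term


# Hypothetical inputs: ratio 70%, Libor forward 5%, Libor up 10 bp, ratio up 1 point.
libor_term, ratio_term = muni_rate_change(0.70, 0.05, 0.0010, 0.01)
```

The exact change differs from the sum of the two terms only by the second-order cross term, the product of the two small changes.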


Muni Bond Futures and Options, and Comparison to the Treasury Market 


Listed muni bond futures and some options on muni futures exist⁸. These are
relatively illiquid compared to the treasury market. Relative strategies between 
the muni and treasury markets are often used, under the rubric MOB (“Municipal 
Over Bond") strategies. The pitfalls include the breakdown of the correlation 
between the two markets, which are driven by different dynamics. 


Treasury-Bond Option Models 


Muni-bond-future options models are similar to treasury-bond option models 
based on the Black-Scholes model. Some details include the use of an appropriate 
repo rate to get the forward bond price. This is because repo is used for funding 
the bond; the forward price minus the spot price is the interest on the repo loan.
A “risk-free rate” is used for discounting (generally at a spread at or above the repo


8 Muni Futures and Muni Options: The muni futures are settled in cash as $1,000 times
the Bond Buyer municipal bond index. The contracts are Mar, Jun, Sep, Dec and are 
traded on the CBOT (Chicago Board of Trade), but only the nearest future is liquid. Muni 
futures options are American style short-dated options. The Bond Buyer 40 index 
includes 40 high-quality actively traded tax-exempt revenue and GO bonds; the 
composition is changed from time to time. 
rate). See footnote for details⁹. The use of this simple model here is reasonable.
This is because the bonds are insignificantly affected by the “pull to par" at 
maturity. To a good approximation, an American feature gives a small correction 
to the European result. This correction can be expressed as an equivalent change 
in the implied volatility of the option, and is generally only a fraction of one 
percent. 

Short-dated muni-future options can similarly be reasonably priced using a 
modified Black model, using the underlying as the price of the muni future¹⁰.
Again, the bonds on which the future is based for delivery have much longer 
maturities than the short-dated option, so the pull to par is negligible. 


Example of Forward Muni Issuance With Derivative Hedging 

Here is a hypothetical deal involving municipal bond issuance including hedging. 
Muni rates have dropped. The ABC municipality has an outstanding bond that it 
would like to refund given the savings implied by current low rates, but for 
various reasons ABC cannot call the bond. Therefore, ABC enters into a deal 
with a broker-dealer BD. The BD agrees to take the risk to underwrite an
ABC bond issue at time t_Issue in the future, with the constraint that ABC's
yearly debt service $DS_l in each year t_l will be specified now. Here¹¹, $DS_l
is the principal re-payment $P_l plus the interest payment $I_l on the unpaid
principal for coupon c_l. The amount $DS_l is advantageous from ABC's point
of view, giving some compensation for the current low rates. The coupons c_l
will be set at issuance t_Issue, and will be fixed (not depending on a floating rate).

The analysis of the risk of the deal from BD's point of view depends on the
changes of muni rates between now t_0 and t_Issue. If muni rates do not change, say
that BD makes $C_profit. If rates rise, in order to sell the deal to investors,
the required coupons will rise, increasing interest payments. Since the ABC


9 Treasury-bond Option Models: A modified Black-Scholes model gives a good
approximation to short-dated options on long-maturity bonds. A “dividend yield” is set as
the annual coupon divided by the bond price. The clean bond price (without accrued
interest) is used as the underlying variable. Other models are yield-based, but are very
close in value to the price-based model.

It should be noted that bond prices may be quoted as 100.nmr. Here nm = 00 to 31
indicates fractions of 1/32, and r = 0 to 7 indicates fractions of 1/8 of 1/32.


10 Black Option Model: This formula, appropriate for an option on a future, can be
obtained from the Black-Scholes model by including a fictitious “dividend” set equal to
the risk-free rate, since the future is not an asset.


11 Time-Dependent Coupons and Principal Repayments: We keep the notation
general.
total debt service is fixed, the notional principal $P_l for the bond will have to
decrease. Since BD has underwritten the deal at a fixed total amount $P_Deal,
BD will lose money if rates rise enough, i.e. such that the profit or loss “P&L”, 
is negative, viz $P,,, = $P, -$P u - $SCOV C""*? <0. On the other hand, if 


eal profit 
rates decrease, then the principal $P_l increases, and then BD makes more
profit. However, BD does not want to take a view on rates and therefore wants
to hedge the exposure to any rate change.


Hedging the Issuance 
Possible hedges for the issuance include: 


• Go Short Muni Bond Futures
• Go Long (Buy) Libor Payer's Swaption (and sell other swaptions)


The first possibility is to go short muni-bond futures. The short muni-bond
futures hedge will increase in value as Muni rates increase, and the number of
contracts can be arranged to stabilize $P_P&L, as determined by running various
scenario analyses. The futures need to be rolled over if t_Issue is past the next
futures expiration date. We still have the basis risk Basis = Cash − Future after
putting on a futures hedge. Here the cash is not the Bond Buyer index on which
the Muni bond future is written, but the ABC issuer bond Cash_ABC. The total
basis risk is thus the usual (Index vs. Future) basis risk
Basis_Index = Index − Future plus an idiosyncratic (Cash vs. Index) basis risk
Basis_ABC = Cash_ABC − Index.


The second possibility is based on swaption hedging. The payer’s swaption 
hedges the increase in rates by paying off more when rates increase. Now buying 
a swaption has an initial cost. This cost can be reduced or made zero by selling 
other swaptions, for example selling a payer’s swaption at a higher strike and/or 
selling a receiver’s swaption. The swaption parameters would be obtained by 
achieving a measure of stability of $P_P&L using scenario rate analyses, along
with basis risk assumptions. In this case, the basis risks are the Muni/Libor risk
and the idiosyncratic cash risk.

There is also credit risk. If the credit of ABC drops before the time of
forward issuance, the coupons that are needed to make the deal sell to investors
will need to rise, also reducing $P_P&L. An estimate of this risk would be built
into the nominal profit $C_profit, and a credit reserve might be taken against this
risk.
Difference Option on an Equity Index and a Basket of Stocks


This section discusses a classic European call option with payoff related to the
difference of an equity index, e.g. S_500 (S&P 500), and a basket of stocks S_B. So
we need to consider the difference variable S_500 − S_B. We can think of this


option as a call option on the index, where the strike of the option is not a 
constant but equals the basket price. Going long this option hedges against 
underperformance of the basket relative to the S&P benchmark. Thus, if the 
basket outperforms the S&P, the call is worthless but the basket beats the 
benchmark. On the other hand, if the basket underperforms, the call is in the
money and the payoff reduces the underperformance of the basket. 

The formalism¹² amounts to a straightforward application of the
two-dimensional path-integral projected onto one dimension. Here, we skip the
formalism and only discuss the results and some intuition.

Assuming correlated lognormal dynamics for the index and basket, the
difference call option is found to be¹³

\[ C = S_{500}\, e^{-y_{500} T}\, N[\Phi_+] - S_B\, e^{-y_B T}\, N[\Phi_-] \tag{16.3} \]

Here S_500 (spot S&P 500 index), S_B (spot basket value), T = t* − t_0 (time to
expiration), y_500 (S&P dividend yield), y_B (basket dividend yield), σ_500 (S&P
volatility), σ_B (basket volatility), ρ (correlation of S&P, basket returns),

\[ \Phi_\pm = \frac{\ln \xi \pm \Lambda^2 T / 2}{\Lambda \sqrt{T}}, \quad \text{where} \quad \xi = \frac{S_{500}\, e^{-y_{500} T}}{S_B\, e^{-y_B T}} . \]

Finally, the composite variance (the square of the total volatility Λ) is

\[ \Lambda^2 = \sigma_{500}^2 + \sigma_B^2 - 2 \rho\, \sigma_{500}\, \sigma_B \tag{16.4} \]
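As a rough illustration of Eqs. 16.3 and 16.4, the following sketch evaluates the difference option for correlated lognormal index and basket. It uses the standard exchange-option form with dividend yields; the function names and the handling of the degenerate zero-volatility case are assumptions made here for the example.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def difference_call(s_index, s_basket, y_index, y_basket,
                    sig_index, sig_basket, rho, T):
    """Value of a call paying [S_index - S_basket]_+ at expiration T."""
    # Composite variance: the minus sign mirrors the minus sign in the difference.
    lam2 = sig_index ** 2 + sig_basket ** 2 - 2.0 * rho * sig_index * sig_basket
    index_leg = s_index * math.exp(-y_index * T)
    basket_leg = s_basket * math.exp(-y_basket * T)
    if lam2 <= 0.0 or T <= 0.0:
        # No relative volatility: the option is worth its deterministic value.
        return max(index_leg - basket_leg, 0.0)
    lam_sqrt_t = math.sqrt(lam2 * T)
    phi_plus = (math.log(index_leg / basket_leg) + 0.5 * lam2 * T) / lam_sqrt_t
    phi_minus = phi_plus - lam_sqrt_t
    return index_leg * norm_cdf(phi_plus) - basket_leg * norm_cdf(phi_minus)
```

Note that the interest rate does not appear. A basket that tracks the index closely (correlation near one, equal volatilities) has a small composite volatility, and the hedge becomes correspondingly cheap.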


Intuition for the Difference Option Formula 

It is not hard to see why the option formula makes sense. The payoff at expiration
t* is the difference of the index and the basket, if that is positive, viz
[S*_500 − S*_B]_+. Hence the variable S*_B acts like the strike at expiration and the
basket spot price S_B acts like the strike at the current time. The formula for the
difference option C has a Black-Scholes form with standard normal-function


12 Path Integral Formalism: See Ch. 42 and 45.


13 Homework: You might like to derive this formula.
arguments. The discount factors exp(−rT) are cancelled by forward price
factors exp(rT), occurring for both the index and the basket, and leaving the
dividend-yield factors exp(−yT). The term −2ρσ_500σ_B in the composite
volatility Λ has a minus sign because S_500 − S_B also has a minus sign. So you
see the formula makes intuitive sense.

The hedge ratios N_500 and N_B for the S&P and basket can be found by
considering the portfolio V = C + N_500 S_500 + N_B S_B and following standard logic
as explained in Ch. 42.

In the previous chapter we discussed yield-curve options, simple forms of 
which resemble this option. 


Also, it is worth noting that the connection with options on the maximum of
two assets is found using the formula [A − B]_+ = max(A, B) − B, a case
originally treated by Stulz. The generalization to many variables was done by
Johnson.


Resettable Options: Cliquets 


Resettable options are another type of option whose strike is not fixed. They are 
generally baskets of component options. Note that a “basket of options" is 
different from an “option on a basket", so we are talking about sets of options, 
not options on a composite variable. The component options have the feature that 
their strikes are unknown at the start, and are dependent on the behavior of the 
underlying variable at fixed times in the future. That is, the strikes are “reset”, so 
these options generally could be called “resettable options". We have already 
looked at one example, the reload option, in a previous chapter. 

“Momentum” caps are a set of interest-rate options that have resetting at-the- 
money (ATM) strikes at particular times. The holder of a momentum cap 
receives the rise in the interest rate from one reset to the next. 

A “cliquet” is similarly a set of component options “cliquetlettes” expiring at 
successive times, formulated for equities or FX, and again generally reset ATM. 
Various bells and whistles are sometimes added to cliquet deals, including 
guaranteed returns, participation factors, strike percentages, and also cheapening 
the structure by replacing the component call options with call spreads. Cliquets 
can act as hedging vehicles for multi-year equity annuity programs that are based 
on equity returns, e.g. programs offered by insurance companies. 


Pricing Resettable Options 


We can get an exact solution for sufficiently simple deals. We use the language
of stocks with stock price S(t). If the strike E of a component option expiring
at t* is set ATM at the intermediate time t_E < t*, then E = S_E. Hence the payoff
of a call option at expiration is proportional to (S* − S_E)_+. In order to value the
option today t_0, we merely need to write the discounted expected value as the
succession of two propagations in the two intervals (t_0, t_E) and (t_E, t*),
multiply by the payoff, and do the integrals. With the usual lognormal dynamics,
these integrals are simple and closed-form solutions result.

The picture for a component of a generic resettable option gives the idea:

[Figure: Resettable option expiring at t* has strike E set at time t_E as the value S_E, at-the-money.]


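The form of the integration we need, with x = ln S, is

\[ C(x_0, t_0) = \int_{-\infty}^{\infty} dx_E \int_{-\infty}^{\infty} dx^* \left[ \exp(x^*) - \exp(x_E) \right]_+ G(x_E, x^*; t_E, t^*)\, G(x_0, x_E; t_0, t_E) \tag{16.5} \]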


Here the discount factors and normalization are included in the Green functions.
Since by assumption the Green functions are Gaussians in the x variables, we
can do the integrals by completing the square using ordinary calculus. After a
little algebra we get (with τ_0E = t_E − t_0 and τ_E* = t* − t_E) the result¹⁴

\[ C(x_0, t_0) = S_0\, e^{-y \tau_{0E}} \left[ e^{-y \tau_{E*}}\, N(d_+) - e^{-r \tau_{E*}}\, N(d_-) \right] \tag{16.6} \]

Here d_± = (r − y ± σ²/2) τ_E* / (σ √τ_E*). Also (r, y, σ) are the discount
interest rate, dividend yield, and volatility respectively. Note that the bracket
multiplying S_0 is independent of S_0, and is in fact the expression for an option
over time interval τ_E* with price and strike equal to 1.
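The closed form of Eq. 16.6 is short enough to code directly. The sketch below assumes constant rate, dividend yield, and volatility, and follows the standard forward-start ATM call result; the function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def forward_start_atm_call(s0, r, y, sigma, tau_reset, tau_option):
    """Call whose strike is set at-the-money after tau_reset, expiring tau_option later."""
    sq = sigma * math.sqrt(tau_option)
    d_plus = (r - y + 0.5 * sigma ** 2) * tau_option / sq
    d_minus = d_plus - sq
    # Bracket = option over tau_option with price and strike both equal to 1.
    unit_option = (math.exp(-y * tau_option) * norm_cdf(d_plus)
                   - math.exp(-r * tau_option) * norm_cdf(d_minus))
    # Dividends paid before the reset reduce the stock delivered at the reset date.
    return s0 * math.exp(-y * tau_reset) * unit_option
```

With zero dividend yield the value does not depend on when the strike is reset, only on the length of the option period, which is one way to see that the bracket is independent of the stock price.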


Realistic Cliquet Deals 
Cliquet deals, as mentioned above, are actually more complicated. We can have a
guaranteed minimum return R_min, participation factors {λ_l}, persistence factors
{π_l}, strike percentages {ξ_l}, and a maximum return R_max. The l-th return is
R_l = [S_l − ξ_l S_{l−1}] / S_{l−1} and the l-th notional is $N_l = π_l $P. The l-th
component cliquetlette payoff equation at t_l is defined as¹⁵

\[ \$C_l = \$N_l \max\left[ R_{min},\ \min\left( \lambda_l R_l,\ R_{max} \right) \right] \tag{16.7} \]


€ 


Defining the effective strikes E o = $, [1 tA | 4 А апа 
ES = 6$, [1 + Жык] along with the time periods Tt, = t —t, and 
t; = f — D , the result for the expected discounted payoff is! 


—T, 
C,(x,5)-6 4 Rigg + — [с -c | (16.8) 
0 


" Homework again: You can do the integrals, just try. 


15 MaxMin: You need to go through the payoff equation one step at a time starting from
the inside expression to see the logic.


16 First Cliquetlette: The first component option in a cliquet deal already done needs a
separate calculation since its strike has already been set.
Note that the difference of C^(A) and C^(B) defines the call spread C^(A) − C^(B).
The options have standard Black-Scholes forms,
С(® = Se" N(d,,)- EON (d1) 
- Se s w(a, )- gl 
1, =|1 a(s, / E | (r-y+0*/2)z,|/(oyre} 
4, =|1 n(s, Е) (r- ve). odr] 


ГЕ 


(16.9) 


If the rate r is taken from a term structure, the rate multiplying the reset-period
time interval in the forward stock price is the forward rate between t_{l−1} and t_l.
Similar remarks hold for the dividend yield and the volatility. The rate multiplying
the time to expiration in the discount factor is the spot rate for that time.


Example of a Cliquet Deal 
In this deal, there are seven components or “cliquetlettes”, with annual resets.
The minimum and maximum returns are R_min = 2%, R_max = 10%. The
participation factors and persistence factors are λ_l = π_l = 1 and the strike
percentages are ξ_l = 100% for simplicity. The rates are r = 5%/yr, dividend
yields y = 1.3%/yr both continuous/365, and the volatility for the C^(A) options
is σ = 40%/√yr. We put in some vol skew for the C^(B) options,
δσ_skew/σ = −10%. The notional is $P = $100MM. We pay at the end and
reinvest the payments for each cliquetlette at a reinvestment rate r = 5%/yr.
The effective strikes are E_l^(A)/S_{l−1} = 1.02, or 2% out-of-the-money (OTM) for the
long A options, and E_l^(B)/S_{l−1} = 1.10, or 10% OTM for the short B options.

We get the 7-year cliquet as the sum of the cliquetlettes,
$C_Cliquet(x_0, t_0) = Σ_{l=1}^{7} $C_l(x_0, t_0). We get $C_Cliquet = $39.5MM. The
components are $C_Guarantee = $11.5MM for the P.V. of the guarantee from
R_min and $C_Calls = $28MM for all call options.


We can do a quick sanity check¹⁷. At average 3.5 years, the discount factor
with continuous compounding is 0.84, so the guarantee should be worth around
0.84 × 7 yrs × 2%/yr × $100MM, which checks $C_Guarantee. The average per year for
the call spread is $28MM/7 = $4MM, or about 4% on the average for the 1-year
call spreads. We find $16.36 for a 1-year option 2% OTM and $11.74 at 10%
OTM, or $4.62 for the first call spread. Since this already includes one year of
discounting, we multiply by 0.88 to get the average call spread, which checks
$C_Calls.
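The sanity check can be reproduced in a few lines. The sketch below is one reading of the numbers above, with some assumptions stated in the comments: spot normalized to 100, a plain Black-Scholes call with dividend yield, the skewed volatility taken as 40% × (1 − 10%) = 36% for the 10% OTM option, and the extra discounting taken over the 2.5 years remaining after the first year. The function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bs_call(spot, strike, r, y, sigma, t):
    """Black-Scholes call with continuous dividend yield y."""
    sq = sigma * math.sqrt(t)
    d_plus = (math.log(spot / strike) + (r - y + 0.5 * sigma ** 2) * t) / sq
    d_minus = d_plus - sq
    return (spot * math.exp(-y * t) * norm_cdf(d_plus)
            - strike * math.exp(-r * t) * norm_cdf(d_minus))


def cliquet_sanity_check(r=0.05, y=0.013, sigma=0.40, skew=-0.10,
                         r_min=0.02, r_max=0.10, n_years=7, notional_mm=100.0):
    """Back-of-envelope numbers for the guarantee and the average call spread."""
    avg_years = n_years / 2.0
    guarantee = math.exp(-r * avg_years) * n_years * r_min * notional_mm
    call_a = bs_call(100.0, 100.0 * (1.0 + r_min), r, y, sigma, 1.0)
    # Assumption: the skew is a relative shift of the volatility for the B option.
    call_b = bs_call(100.0, 100.0 * (1.0 + r_max), r, y, sigma * (1.0 + skew), 1.0)
    first_spread = call_a - call_b
    # Assumption: one year of discounting is already inside the 1-year option price.
    avg_spread = first_spread * math.exp(-r * (avg_years - 1.0))
    return {"guarantee": guarantee, "call_a": call_a, "call_b": call_b,
            "first_spread": first_spread, "avg_spread": avg_spread}
```

The two option values should land close to the quoted $16.36 and $11.74, and the average call spread near the $4MM per year inferred from the full calculation.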


Power Options 


Power options have payoffs proportional to a power of the difference between the 
underlying variable and the strike. For example, we can define an interest-rate 


power floorlet (PF) with power λ ≥ 1. Call f* the forward Libor rate at option
expiration t*. If the floorlet has strike E, we write the power-floorlet payoff as
$C*_PF = $K (E − f*)^λ θ(E − f*) with $K = $N dt_f / E^(λ−1). Note that
the payoff has units of $money only.

We can find the power floorlet value $C_PF(x_0, t_0) today t_0 with the spot
rate f_0, assuming for example standard lognormal dynamics for the forward rate
changes d_t f(t). We first expand $C*_PF in a power series in f* (which
truncates if λ is an integer), viz

\[ (E - f^*)^{\lambda} = \sum_{k=0}^{\infty} \binom{\lambda}{k} (-f^*)^{k}\, E^{\lambda - k} \tag{16.10} \]
k=0 


Now we use the standard discounted expected formalism corresponding to
the lognormal Green function. Define x* = ln f*, x_0 = ln f_0, volatility σ, time
to expiration τ* = t* − t_0, and write Φ_E = [ln(E/f_0) + σ²τ*/2] / (σ√τ*).

After a little algebra, we find that we need the integral I_k where

\[ I_k = \exp\left( k^2 \sigma^2 \tau^* / 2 \right) N\left[ \Phi_E - k \sigma \sqrt{\tau^*} \right] \tag{16.11} \]

This is enough to get the result,

\[ \$C_{PF}(x_0, t_0) = \$K \cdot P \cdot \sum_{k=0}^{\infty} \binom{\lambda}{k} E^{\lambda - k} \left( -e^{-\sigma^2 \tau^* / 2} f_0 \right)^{k} I_k \tag{16.12} \]

Here P is the appropriate discount factor.


17 Sanity Checks: Yes, you have to do sanity checks. For one thing, you want to have a
one-liner to explain the results to the management. Also how do you know that your C++
programmer with only two months of experience on the Street hasn't screwed up?
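A series of this type is easy to check numerically. The sketch below assumes a driftless lognormal forward rate (so that the expected value of f* is f_0), expands the payoff binomially, and uses the lognormal moments restricted to f* < E. The function names, the truncation at n_terms, and the restriction to moderate σ√τ* (to avoid overflow in the exponential) are assumptions made for this example.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution (erfc keeps the far tail accurate)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def binom_general(lam, k):
    """Generalized binomial coefficient lam (lam-1) ... (lam-k+1) / k!."""
    out = 1.0
    for j in range(k):
        out *= (lam - j) / (j + 1)
    return out


def power_floorlet_series(f0, strike, lam, sigma, tau,
                          k_scale=1.0, discount=1.0, n_terms=40):
    """Discounted expected value of k_scale * (E - f*)^lam for f* < E."""
    sq = sigma * math.sqrt(tau)
    phi_e = (math.log(strike / f0) + 0.5 * sigma ** 2 * tau) / sq
    total = 0.0
    for k in range(n_terms):
        coeff = binom_general(lam, k)
        if coeff == 0.0:
            break  # the series truncates when lam is an integer
        i_k = math.exp(0.5 * k * k * sigma ** 2 * tau) * norm_cdf(phi_e - k * sq)
        shifted_fwd = -math.exp(-0.5 * sigma ** 2 * tau) * f0
        total += coeff * strike ** (lam - k) * shifted_fwd ** k * i_k
    return k_scale * discount * total
```

For λ = 1 the two surviving terms reduce to the usual Black floorlet expression E N(−d_2) − f_0 N(−d_1), which is a convenient first check before trying larger powers.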


Power-Barrier Options 


It is possible to calculate barrier options with power law payoffs¹⁸ by using the
barrier Green functions, as described in Ch. 17. We leave this as an exercise. 


Path-Dependent Options and Monte Carlo Simulation 


This book has emphasized calculations that can be done at least partially 
analytically. The workhorse calculations that cannot be handled analytically 
generally are handled using Monte Carlo (MC) simulation. We do not spend 
much time on MC methods in this book, but we will mention a few applications 
here. They are: 


• Periodic Caps
• ARM Caps
• Index Amortizing Swaps


Periodic Caps 


A periodic cap is a collection of resettable caplets, but now including spreads.
The caplet with expiration date t_l has its strike set at the rate r_{l−1} at the previous


18 Inverse Power Options: We can also have negative powers if we keep the underlying
from approaching zero with a barrier. 
reset date t_{l−1}, plus a spread s_l. The payoff is C_l = [r_l − r_{l−1} − s_l]_+ where
r_l is the final rate at t_l. Other types of resets have been proposed, including
look-back minima. The pricing simply involves evaluating C_l^Path(α) on each
path(α), applying the discount factor appropriate to that path, and then
averaging the results over all paths¹⁹.
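The path-by-path recipe can be sketched in a few lines. Everything model-specific here is an assumption made for illustration only: a single driftless lognormal rate simulated at the reset dates, simple-interest discounting along each path, and a payoff accrued over the reset period. A production simulator would use a calibrated term-structure model. The function name and parameters are hypothetical.

```python
import math
import random


def periodic_cap_mc(r0, sigma, spread, n_resets, dt,
                    n_paths=20000, seed=12345, notional=1.0):
    """Monte Carlo value of a periodic cap under a toy lognormal rate model."""
    rng = random.Random(seed)  # fixed seed so repeated runs are reproducible
    total = 0.0
    for _ in range(n_paths):
        rate_prev = r0
        discount = 1.0
        path_value = 0.0
        for _ in range(n_resets):
            z = rng.gauss(0.0, 1.0)
            rate = rate_prev * math.exp(-0.5 * sigma ** 2 * dt
                                        + sigma * math.sqrt(dt) * z)
            # Discount factor built from the rates realized on this path.
            discount /= (1.0 + rate_prev * dt)
            # Caplet strike = previous reset rate plus the spread.
            payoff = max(rate - rate_prev - spread, 0.0) * dt * notional
            path_value += discount * payoff
            rate_prev = rate
        total += path_value
    return total / n_paths
```

Reusing the same seed when comparing two spreads, as in `periodic_cap_mc(0.05, 0.2, 0.0, 4, 0.5)` against `periodic_cap_mc(0.05, 0.2, 0.005, 4, 0.5)`, evaluates both deals on identical paths, which removes Monte Carlo noise from the comparison.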


ARM Caps 


ARM Caps reflect the dynamics of the adjustable rate mortgage from which these 
options get their acronym. The payoff of one of the component caplets C_l at
time t_l involves the difference between the Monte-Carlo (MC) rate r_l and a
calculated rate r_l^calc. There is a notional $N_l, and a time interval Δt (e.g.
6 months). The result is $C_l = $N_l [r_l − r_l^calc]_+ Δt. Again we discount
the cash flow C_l for each path and average the results over the paths.


The calculated rate functions like an ARM rate paid by a homeowner. It 
depends on a variety of complications including initial teaser rates, lifetime 
ceilings and floors for rates, and maximum allowed jumps up or down for rates. 
The calculated rate therefore depends on the path, including the various rules 
depending on the deal. Most of the work involves implementing the logic for the 
rules. The simulated MC rate plays the part of the market rate on which the ARM
is based, and does not have anything to do with the rules. 

Because the homeowner is long an embedded option like an ARM cap in an
adjustable rate mortgage²⁰, combinations involving ARMs and ARM caps
packaged together can be made free of options. The notionals {$N_l} are then
chosen so as to reflect the anticipated amortization for the principal of the ARMs in


the deal. This combined structure may appeal to investors who do not want to 
take option risk, but who may get extra pickup on the yield. 


19 General Statement for Monte Carlo - Boring But Brawny: Get the cash flows on
each path and discount them. Then average the results over paths. This brute-force
method is really boring and inelegant, but it's the only way to fly for many situations.


20 ARM Floor: There is also an embedded floor that the homeowner is short, reflecting
the possibility that rates drop below the amount paid by the homeowner.
Index-Amortizing Swaps


Introduction to Index-Amortizing Swaps (IAS) 


Here we consider an “exotic” swap called an indexed-amortizing swap²¹ (IAS),
also called an indexed-principal swap (IPS). In the section on swaps, we said that
a different notional $N_l for each swaplet could occur. If these notionals
decrease, the swap is called an amortizing swap. For the IAS, the swap amortizes
according to a schedule fixed in advance. The schedule consists of an association
between possible levels of an index floating rate F_Index and the amortization of
the swap. The index rate F_Index is generally not the floating rate f in the
swap. The notionals {$N_l} of the swap depend on the path that the index rate
takes as time progresses. There is an initial “lock-out” period when no
amortization occurs. The picture gives the idea:


[Figure: Index-rate paths and the amortization schedule levels for the Index Amortizing Swap (IAS). One path stays above the schedule levels (no amortization); another path drops through them (amortization). There is no amortization during the lock-out period. Each index rate level in the IAS schedule defines an amortization % for the notionals in the swap, if the index rate drops to that level.]


21 Acknowledgement: I thank Alex Perelberg for helpful discussions on IAS, and on
many other topics. 
We write the notionals of the swap for a given path, path_α, of the index rate F_Index
as $N_l = $N_l[F_Index(path_α)], which amortize according to the schedule. The
swap along that path can be written as


SSF, (pu, = 9: SN [Fa pun. )] (70) в), Р), 
(16.13) 


The entity ABC enters into the swap with the broker-dealer BD, receiving 
fixed and paying floating according to, say, Libor. A typical IAS deal diagram 


looks like this: 
[Diagram: Index-Amortizing Swap (IAS). ABC pays floating in the swap to the Broker-Dealer, with an amortizing notional equivalent to an option given to BD. ABC receives fixed plus an extra spread in the swap from the Broker-Dealer.]


The extra spread and option are discussed below. 

Here is an example of an amortization schedule, with the change in index rate 
given in bp from its starting value at the beginning of the deal, and the 
amortization applied at each reset: 
Index Rate (bp) | Amortization/yr
 300            | 1%
 200            | 2%
 100            | 5%
   0            | 15%
-100            | 20%
-200            | 30%
-300            | 35%
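To make the schedule concrete, here is a minimal sketch of how a simulated index path could be turned into swap notionals. The conventions are assumptions, since deals differ: a step lookup with no interpolation between levels, no amortization if the index is above the top level, amortization applied to the remaining notional at each reset after the lock-out, and the function names themselves.

```python
# Schedule from the table above: (index change in bp, amortization per year).
SCHEDULE_BP = [(300, 0.01), (200, 0.02), (100, 0.05), (0, 0.15),
               (-100, 0.20), (-200, 0.30), (-300, 0.35)]


def amortization_rate(change_bp, schedule=SCHEDULE_BP):
    """Amortization rate for the lowest schedule level the index has dropped to."""
    rate = 0.0  # assumption: no amortization above the highest level
    for level_bp, level_rate in sorted(schedule, reverse=True):
        if change_bp <= level_bp:
            rate = level_rate  # index is at or below this level
    return rate


def notional_path(index_changes_bp, initial_notional, dt_years,
                  lockout_resets, schedule=SCHEDULE_BP):
    """Notional after each reset along one index path."""
    notionals = [initial_notional]
    current = initial_notional
    for i, change_bp in enumerate(index_changes_bp):
        if i >= lockout_resets:  # no amortization during the lock-out period
            fraction = min(amortization_rate(change_bp, schedule) * dt_years, 1.0)
            current *= (1.0 - fraction)
        notionals.append(current)
    return notionals
```

In a Monte Carlo pricer, a function like `notional_path` would be called once per simulated index path, and the resulting notionals would multiply the swaplet cash flows on that path before discounting and averaging.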


Connection of IAS with Barrier Options 


Here is another schedule, just equivalent to a simple barrier knockout option if 
the index rate goes down 100 bp, at which point the notional would disappear: 


Index Rate | Amort/yr 
0 0% 
-100 100% 


Analysis of the IAS and its Embedded Options 


If the index decreases enough to cross one or more levels of the schedule, the 
swap notionals increasingly amortize. ABC is receiving fixed including an extra 
spread and paying floating. If the index rate is decreasing, because of the 
generally positive correlations between rates, probably the floating rate that ABC 
is paying to BD is also decreasing. Hence the swap becomes in-the-money to 
ABC. Clearly ABC likes this arrangement. The amortization however is making 
the swap go away, to the benefit of BD. If the index rises, the BD does not want 
to lose the extra floating payment from ABC. 

So, effectively, there is an embedded option in the IAS because ABC has 
given an option to BD to partially stop the swap. This option is, from BD’s point 
of view, something like having a receiver’s swaption, which gives BD the right to 
enter into a swap to receive fixed and pay floating. However this swaption is 
path-dependent, since it depends on the way the index is behaving with respect to 
the amortization schedule in the IAS. 

In order to compensate for this embedded option that ABC gives to BD, ABC 
gets an extra spread in the swap. 


Example of Application of the IAS: Mortgage Servicing Hedging 

One application of index-amortizing swaps is to provide a hedge against 
prepayment risk for mortgage servicing. As rates drop, homeowners
increasingly prepay their mortgages, and a mortgage servicer ABC loses some 
business on which it is making the servicing spread. ABC therefore wants a
hedge that increases in value as rates drop, a floor-like product. Now the 
servicing spread is calculated including anticipated prepayments. The servicer 
pays for a hedge with a schedule consistent with these anticipated prepayments. 
The amount of the hedge decreases as rates drop since the amount of mortgage 
collateral decreases. The IAS provides these features. 

The swap notional amortization depends on the index. For example, the index 
can be the 5-year Libor swap rate, a CMT rate, the prepayment on some GNMA 
mortgages, etc. If the index is tied to the mortgage rate driving refinancings and 
prepayments, then the IAS can work as a mortgage-servicing hedge. 


Pricing the IAS, a Volatility-Dependent Swap 


It is clear from the description that a Monte-Carlo simulator is needed to price the 
IAS. The model in principle needs to simulate the index rate paths, and have a 
mechanism for correlating the swap floating rate with the index in order to price 
the swap. 


No Simplification for IAS 


Naturally, the IAS depends on the rate volatilities used in the MC simulator 
because these determine the probabilities that the paths will wind up crossing the 
schedule levels and changing the amortization. 

Note that even if the notionals are independent of the swaplet index l, the
IAS cannot be written as a bond and FRN. This is because the floating payment 
notionals depend on a stochastic index and are thus not constant, as needed to 
produce the FRN = par equivalence at reset. 


Sociology 

Typically, the quants will write the code and then deliver it to the systems people 
who put it in the system. Deals will be brought by the sales group to the desk 
where the trader or desk quant uses the system to price the deals²².


A Hypothetical Repo + Options Deal 


This example exhibits a hypothetical repo transaction with side conditions. For 
the deal, we do not have to know much about repo. The risk involves a forward
determination of repo specified in the transaction that is hedged in the ED futures 


22 History: Index amortizing deals are now standard products that started around 1991.
My activity (in two different jobs) was pricing them on an exotic interest-rate desk, and 
later supervising a quant group producing the numerical pricing code. 
market. There are also max/min constraints on the repo rate that amount to
embedded repo options, and that are hedged in the ED futures options market. 
The main risk of the deal is the risk due to the uncertainty in the repo rates. 


Definition of the Deal 
The broker-dealer BD and investor X make the following arrangement. 


1. BD pays repo to X at an agreed-on repo rate r_Repo^(0) for a period of time up to
the next ED future-contract IMM date.

2. BD pays a “tail” repo rate r_Repo^(Tail) from the next IMM date to deal maturity. The
tail rate is defined as the r_Repo^(0) minus the change in the nearest ED future price
(i.e. plus the change in the ED future rate).

3. Maximum and minimum repo rates are specified by the contract. These
specifications amount to embedded options in the contract; one can think of them
as contingent cash flows if r_Repo goes outside the boundaries.

4. In return, X pays a fixed rate E_Fixed to BD.


Motivation and Risks for the Investor X 


The motivation for the investor X can be that X thinks rates are likely to go up in 
the near term, or perhaps that X has some internal risk of rates going up and 
wants to hedge against that risk. 


Motivation and Risks for the Broker-Dealer BD 


The diagram summarizes the cash flows and options in the deal²³.


? Repo Transaction Cash-flow Diagram: The arrows indicate the cash flows. The BD 
pays the repo rate to X. This rate changes at the next IMM date. If the ED future rate on 
which the contract is based goes up (down), BD will pay a new repo rate up (down) by 
the same amount. This is called the “tail” rate because it is at the tail end of the deal in 
time. There are two options embedded in the contract, a repo cap and floor. The BD is 
long the cap and is short the floor. The repo rate paid by BD is limited by the repo cap 
(below the maximum in the contract). The repo rate received by X has the contract 
minimum provided by the repo floor. The combination of a long cap and short floor is 
called a collar. Finally, the BD receives the fixed rate $E_{Fixed}$ from X.




Hypothetical Repo Deal with Options

[Diagram: cash flows between Broker-Dealer BD and Investor X]
BD  -->  X :  Repo (reset at IMM)
BD  <--  X :  Repo cap contingent
BD  -->  X :  Repo floor contingent
BD  <--  X :  Fixed rate $E_{Fixed}$


The deal can be attractive to BD if the $r_{Repo}$ is low enough relative to $E_{Fixed}$. The BD pays repo $r_{Repo}$ and receives the fixed rate $E_{Fixed}$, so the BD can make a profit through the “Carry” $= E_{Fixed} - r_{Repo}$. The main risk of the deal is the repo risk in the tail period from the next ED future IMM date to the deal maturity. Scenarios would have to be run for variations in the ED future movement to assess the risk.
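The deal terms and the carry can be made concrete with a minimal Python sketch. The function names, argument names, and sample numbers are hypothetical, not taken from any actual contract; rates are in percent, and the ED future price is assumed to be quoted as 100 minus the ED future rate.

```python
def tail_repo_rate(r_repo, fut_price_start, fut_price_imm, r_min, r_max):
    """Tail repo rate (percent) paid by BD after the IMM date.

    The unconstrained tail rate is the initial repo rate minus the change
    in the nearest ED future price. The contract maximum and minimum then
    limit the rate (the embedded repo cap and floor).
    """
    unconstrained = r_repo - (fut_price_imm - fut_price_start)
    return min(max(unconstrained, r_min), r_max)


def carry(e_fixed, r_paid):
    """Carry to BD (percent): fixed rate received minus repo rate paid."""
    return e_fixed - r_paid


# Hypothetical scenario: the ED future price falls by 25 bp,
# so the tail repo rate rises by 25 bp and the carry shrinks.
tail = tail_repo_rate(5.00, 94.50, 94.25, r_min=4.00, r_max=6.00)
carry_after_reset = carry(5.60, tail)
```

Running the same function over a set of assumed ED future moves gives a crude version of the scenario analysis mentioned above.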


It should be clear that this looks like a swap (fixed $E_{Fixed}$ against floating $r_{Repo}$) with some extra options. In fact, the repo market mirrors the interest-rate


? Risk Scenarios: The risk in the deal can be assessed by calculating the results 
assuming various changes in the ED futures that in this contract determine the tail repo 
risk. This can be done by looking at historical ED future data movements at some 
confidence level. The analysis is done in time windows from the transaction date to the 
next IMM date. Alternatively, a scenario based on "judgment" can be applied. Given the 
change in the ED future, the risk of the deal is assessed. We will have a lot to say about 
risk measurement of this sort later in the book. 


? This Repo Deal is Basically a Short-Dated Swap: Sometimes a repo desk
resembles a short-dated swap desk. The diagram in the text has been drawn to emphasize 
the repo dependence. By redrawing the picture to make the fixed rate “leg” (arrow) at the 
top of the column of arrows, the top two arrows look like a swap with BD receiving fixed 
and paying floating. The floating repo rate here is, however, not determined by the repo 
rates in the market at the reset date, but by the deal contract specifying the repo reset via 
the ED futures market. 
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derivatives market in many ways, the main difference being in the short-dated 
maturity of the repo transactions²⁶.


Hedging the Deal 


The hedging of the deal by the BD consists of two parts: 
1. BD shorts ED futures in order to hedge the repo reset risk²⁷.
2. BD purchases and sells appropriate ED futures (“pit”) options to offset the
repo cap and floor²⁸.
The following diagram summarizes the hedges put on by the BD. 


BD Hedges for Hypothetical Repo Deal

[Diagram: hedge transactions of Broker-Dealer BD]
BD  -->  :  BD shorts ED futures
BD  -->  :  BD sells ED put options
BD  <--  :  BD buys ED call options


26 Repo these days can even be negative. On 3/21/14, overnight repo GC closed at zero. 
Specials in the 2s and 5s traded between -20 to -25 bps. (Source: R. Briggen, 
ThomsonReuters news report). The repo deal in the book was a long time ago when this 
sort of behavior was not even dreamed of. 


27 Hedging the Tail Repo Risk with ED Futures. BD is receiving fixed and paying 
floating, and therefore BD loses if rates go up. BD is “long the market”. To hedge, BD 
has to go “short the market”. BD shorts an appropriate number of the nearest ED future. 
This hedge makes money if the price $P_f$ of the ED future decreases (i.e. if the ED rate
$r_{ED}$ goes up). This offsets the risk to BD of the potential increase in the repo rate.


28 Hedging the Repo Options with ED Futures Options or Pit Options. The BD also
performs some hedging transactions involving ED futures options (“pit options”). These
options involve the same ED future as in the deal. The BD does this in order to flatten out
the vega of the embedded repo cap and repo floor. In this way the overall desk portfolio
will not have different vega after the deal is done. First BD sells ED puts with strike $E_{put}$
which pay off if $P_f < E_{put}$ or if $r_{ED} > 100 - E_{put}$, which offsets the long BD repo cap. Then
BD buys ED calls with strike $E_{call}$ which pay off if $P_f > E_{call}$ or if $r_{ED} < 100 - E_{call}$, which
offsets the short BD repo floor. However these hedges are not complete. The strikes
ideally should be chosen so that one of the options pays off when a repo boundary 
defined in the deal is hit. However the ED pit option strikes are only available at certain 
discrete levels, so there is some residual risk of mismatch. 
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Convertible Issuance Risk 


In this last section, we deal with risk involving the issuance of convertible bonds. 
The risk discussed here is a stock gap-risk. The issuer ABC of the convertible” 
is often not investment grade, and the stock may not be traded in large volume. In 
particular the stock price can drop before the deal can be sold and/or hedged, 
leading to losses by the broker-dealer underwriter BD who has agreed to buy the 
bonds from ABC to sell to investors X. The problem is exacerbated by the fact that
if the stock price drops, the deal will not sell well at the original price. 


A Risk Model 


We first calculate the number $N_{Shares}$ of shares of stock needed to short to hedge the issuance of $N_{bonds}$ (actually, only the part of the issue that has not yet been sold).


Generally, one convertible bond can be converted into a number of shares called 
the conversion ratio С. We need a convertible-bond model that gives the 


convertible bond price $C$ as a function of (among other things) the stock price $S$. For a given move in the stock price $\delta S$, the number of shares needed $N_{Shares}$ is determined by the convertible-bond delta $\Delta = \partial C/\partial S$ and to some extent by gamma $\gamma = \partial^2 C/\partial S^2$. Equating the change in the hedge with the change in the convertible, $N_{Shares}$ is determined by

$$N_{Shares}\,\delta S = N_{bonds}\left(\Delta\,\delta S + \gamma\,\delta S^2/2\right) \qquad (16.14)$$


We need the historical data for the daily volume $N_{daily}$ for ABC stock trading as a function of time. If that information is not available, we can use a proxy stock or equity index closely related to ABC. For simplicity we can first assume Gaussian statistics for the relative stock price changes $dS/S$, requiring the average (Avg) and the standard deviation ($\sigma$).


We choose a confidence level $CL^{daily}$ for the daily volume $N_{daily}$. There is an associated parameter of the number of standard deviations $k_{CL}^{daily}$ (e.g. $k_{CL}^{daily} = 1.65$ at $CL^{daily} = 95\%$). Then we write

$$N_{daily}\left(k_{CL}^{daily}\right) = \mathrm{Avg}\left[N_{daily}\right] - k_{CL}^{daily}\,\sigma\left[N_{daily}\right] \qquad (16.15)$$

The reason for the minus sign is to be pessimistic regarding the number of shares $N_{daily}$ available for hedging in the market. It is bad if there are
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fewer shares because then it takes more time to hedge. The amount of time 
$\Delta t_H$ needed to hedge the deal with this scenario is:

$$\Delta t_H = N_{Shares} / N_{daily}\left(k_{CL}^{daily}\right) \qquad (16.16)$$


We assume some confidence level $CL^{\delta S}$ for stock price moves with associated parameter $k_{CL}^{\delta S}$ in the time $\Delta t_H$ (e.g. 95% CL again). We need to obtain $\delta S\left(k_{CL}^{\delta S}; \Delta t_H\right)$ from downward moves $\delta S < 0$ observed historically in time interval $\Delta t_H$ at the confidence level for down moves chosen, $CL^{\delta S}$.

The loss \$Loss in this pessimistic scenario is the number of shares needed to hedge times the drop in the stock price during the time $\Delta t_H$ needed to put on the hedge. This is divided by two, because on the average during the total hedging time $\Delta t_H$, half the deal will be hedged, so the net loss is halved. We get

$$\$Loss\left(k_{CL}^{daily}, k_{CL}^{\delta S}; \Delta t_H\right) = N_{Shares}\,\delta S\left(k_{CL}^{\delta S}; \Delta t_H\right)/2 \qquad (16.17)$$

Any sale of more bonds to customers will ameliorate this potential loss.
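The chain of steps in Eqs. 16.14 to 16.17 can be sketched in Python as follows. This is only an illustration: the function and argument names are hypothetical, delta is taken in shares per bond, time is in trading days, and the down move is generated from an assumed Gaussian scaling of daily relative returns rather than read from historical data as the text prescribes.

```python
import math


def issuance_gap_loss(n_bonds, delta, gamma, ds_hedge,
                      avg_volume, sigma_volume, k_daily,
                      spot, avg_ret, sigma_ret, k_ds):
    """Pessimistic stock gap-risk loss for an unsold convertible issue.

    Returns (n_shares, dt_hedge_days, ds_down, loss).
    """
    # Eq. 16.14 divided by dS: shares to short for the delta/gamma hedge
    n_shares = n_bonds * (delta + 0.5 * gamma * ds_hedge)
    # Eq. 16.15: pessimistic (low) daily volume available for hedging
    n_daily = avg_volume - k_daily * sigma_volume
    if n_daily <= 0:
        raise ValueError("pessimistic daily volume must be positive")
    # Eq. 16.16: time needed to put on the hedge, in days
    dt_hedge = n_shares / n_daily
    # ASSUMPTION: Gaussian relative moves scaled to dt_hedge days,
    # standing in for the historical down move at the chosen CL
    ds_down = spot * (avg_ret * dt_hedge
                      - k_ds * sigma_ret * math.sqrt(dt_hedge))
    # Eq. 16.17: on average half the deal is hedged during dt_hedge
    loss = n_shares * abs(min(ds_down, 0.0)) / 2.0
    return n_shares, dt_hedge, ds_down, loss


# Hypothetical inputs: 100,000 bonds, delta of 20 shares per bond, no gamma
result = issuance_gap_loss(100_000, 20.0, 0.0, -1.0,
                           600_000, 100_000, 2.0,
                           10.0, 0.0, 0.02, 2.0)
```

With these made-up inputs the hedge takes five days, and the loss is the share count times the five-day down move, halved.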


Preview of Coming Attractions: Math Coming Up Next 


We have now gone over a number of deals and case studies in the book. The next 
few chapters go into the theory of exotic options in some depth. 
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17. Single Barrier Options (Tech. Index 6/10) 


In this chapter, we begin the discussion of barrier options using the path integral
Green function formalism¹. Barrier options are options that have the underlying
process constrained by one or several boundaries called barriers. Usually these
options are European options, that is, there is only one exercise date at which
the option may pay off. If the underlying crosses a barrier, a “barrier event”
occurs, and the option changes its character. For example, the option may
disappear², be replaced by another option³, be replaced by cash⁴, etc. The barrier
is usually continuous⁵ although sometimes it is discrete⁶ and it usually exists over
the whole option period until exercise⁷. The barrier event usually is defined to


1 History: Options for one constant continuous barrier seem to have been first catalogued
by Rubinstein in 1990 (Ref). I performed the connection with path integrals and other
calculations in this chapter in 1991. 


2 Knock-out option: This is an option that disappears if the barrier is crossed. If the
barrier is below the starting value of the underlying, the option is called a “down-and-out”
option. If the barrier is above the starting value of the underlying, the option is called
an “up-and-out” option. If the option does not knock out, it will pay off like an ordinary
option.


3 Knock-in options: These are options that have no payoff but that make another option
appear if the barrier is crossed. Generally, the option that appears is a standard European
option. If the barrier is below the starting value of the underlying, the option is called a
“down-and-in” option. If the barrier is above the starting value of the underlying, the
option is called an “up-and-in” option.


4 Rebates: Sometimes cash is given if the option knocks out, called a “rebate”. The
rebate may either be paid at the time when the knockout occurs, or the payment of the 
rebate may be deferred until some later date. Essentially the rebate is a knock-in step 
option. A knockout option with rebate therefore is really a portfolio of these two options. 


5 Confusions: Do not confuse the continuous nature of the boundary, at which barrier
events may occur — with the single date of the European option exercise. 


6 Discrete barriers: The barrier may also exist at a certain number of points. For
example, the contingent cap described in Ch. 16 has a payment that is essentially a 
discrete barrier option. 


7 Partial barriers: Barriers that exist only over part of the time period to exercise are
imaginatively called partial barriers. Partial barriers are more complicated since they 
involve different propagations over the time periods when the barrier exists and when the 
barrier does not exist. 
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occur the first time the underlying crosses the barrier". Usually the barrier is a 
constant value’. 

The barrier is given in advance, unlike the case in American options where 
the exercise boundary needs to be determined. For example, we may have a 
barrier at a given price of gold. For this reason, European barrier options are 
simpler to deal with than American options". 

Naturally, there is an exotic zoo of barrier options. If more than one variable 
is involved, the option is called a “hybrid” barrier option. For example, we may 
have an interest-rate option that changes character as the price of gold passes a 
barrier value. There may be an upper barrier and a lower barrier. These are called 
double barrier options. Double-barrier options can have events that are different 
for each boundary. There can also be a set of barriers where progressive changes 
in the options occur. Such barriers are called soft barriers''. We can also have 
barrier options mixed with average options". We will consider some of these 
exotics in more detail in Ch. 18-20. 

Barrier options exist in all product types. There are stock barrier options, FX 
barrier options, interest-rate barrier options and commodity barrier options. 
Except for the presence of the barrier, the underlying dynamics are unchanged. 
Therefore, the same set-up as for standard options is appropriate, including the 
no-arbitrage constraints, and the various complexities of the different markets. 
We can specify different types of payouts. The risk measures are substantially 
more complex than for options without barriers and the risk itself is much harder 
to hedge. This is because the barrier options contain various discontinuities. We 
will discuss risk with barrier options. 


* More complicated barrier events: There are additional exotic types of barrier events. 
For example, it might be required that the underlying stay above a barrier a certain 
number of days. 


? Non-constant barriers: These are generally harder to deal with and are much less 
common. 


10 American barrier options: If the option may be exercised at any time and if in 
addition there is a barrier, then all the complexities of American options including 
backward induction logic occur, but this time in the presence of a constraint because of 
the barrier. 


11 Soft barriers: One example is index-amortizing swaps for interest rates, described in
Ch. 16. There, a schedule is provided with progressive events occurring as the index rate 
passes through the different barriers in the schedule. 


12 Barrier-average options: For example, we might require the 5-day average price to be
above the barrier. 
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We shall use path integrals and Green functions to describe the dynamics of 
barrier options". Exactly as for ordinary options, if the parameters are simple 
enough, analytic results can be obtained. From the point of view of path integrals, 
this happens if the semigroup property allows the re-expression of successive 
propagation of the Green function over small-time increments in terms of a 
similar Green function over the total time interval. 


Say $G_{ab}$ is the Green function¹⁴ over $(x_a, t_a; x_b, t_b)$. $G_{bc}$ similarly is the Green function over $(x_b, t_b; x_c, t_c)$, and $G_{ac}$ is the Green function over the total $(x_a, t_a; x_c, t_c)$ with $t_a < t_b < t_c$. The semi-group property in the absence of barriers is given by an ordinary integral over $x_b$:

$$G_{ac} = \int_{-\infty}^{\infty} G_{ab}\, G_{bc}\, dx_b \qquad (17.1)$$


The path integral exactly satisfies the semigroup property, as discussed in
detail in Chapters 42-45. We get simple results if the Green function has a simple
form, notably Gaussian. We discuss the semigroup property including a constant
barrier below.


Knock-Out Options 


Consider first a single barrier for a stock $S$ at a value $H$ and assume that $S < H$. Although we can formulate the general path integral for arbitrary functions $H(t)$, in order to get closed form results we need constant $H$ to carry out the integrations in the path integral explicitly. Later on we will give the results for calculations assuming piecewise constant barriers of different types.
The quickest way to get at the problem is to use the classical method of images to solve the problem of the Green function for the diffusion equation that vanishes on a boundary¹⁵. Transferring as usual to logarithmic variables, we


13 Path Integrals, Green Functions, and Options: See Ch. 41-45 for details.


14 Notation: For $G_{ab}$, $x_a = x(t_a)$ and $x_b = x(t_b)$ are the initial and final points, with time
interval $t_{ab} = t_b - t_a \ge 0$. Also, we will call $\sigma$ the volatility (assumed constant) and $\mu$ the
drift constrained by standard no-arbitrage considerations.


15 Images: Images result from the solution of the differential equation with the
appropriate boundary conditions. The idea is that you replace the original problem with 
another problem for which images are included and the boundary is absent. The solution 
is valid in the “physical region", i.e. in the region for which the boundary is a barrier. The 
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write $x = \ln S$ and $x^{(bdy)} = \ln H$. The diffusion equation operator for lognormal dynamics is

$$\hat{O}_{x_a t_a} = \frac{\partial}{\partial t_a} + \mu\,\frac{\partial}{\partial x_a} + \frac{\sigma^2}{2}\,\frac{\partial^2}{\partial x_a^2}$$

The Green function that solves the equation over the time interval $(t_a, t_b)$ and vanishes on the boundary will be called the KO (knock-out) Green function $G_{ab}^{(KO)}$. It has the form

$$G_{ab}^{(KO)} = G_{ab} - G_{ab}^{(image)} \qquad (17.2)$$

In particular, we have $G_{ab}^{(KO)}\left(x_a = x^{(bdy)}\right) = 0$. All paths that stay on the physical side of the boundary (here assumed as $S < H$) are contained in $G_{ab}^{(KO)}$. If $S < H$, $G_{ab}^{(KO)}$ is relevant to an up and out option. The first term $G_{ab}$ in $G_{ab}^{(KO)}$ is the Green function for the unconstrained problem that we have used repeatedly,

$$G_{ab} = \frac{1}{\left(2\pi\sigma^2 t_{ab}\right)^{1/2}} \exp\left[-\frac{1}{2\sigma^2 t_{ab}}\left(x_b - x_a - \mu t_{ab}\right)^2\right] \qquad (17.3)$$

The $G_{ab}^{(image)}$ term in $G_{ab}^{(KO)} = G_{ab} - G_{ab}^{(image)}$ is the image Green function that is needed for the enforcement of the boundary condition. It is given by

$$G_{ab}^{(image)} = K(x_a)\, G_{ab}\Big|_{x_a \to x_a^{(image)}} \qquad (17.4)$$

Here $x^{(image)}(t) = 2x^{(bdy)} - x(t)$ is called the image path of $x(t)$. At the boundary, the path and the image path coincide¹⁶. Note that with $x < x^{(bdy)}$ in the physical region, we have $x^{(image)} > x^{(bdy)}$, so the image path is in the unphysical region on the other side of the boundary.


use of images in diffusion problems goes back at least to 1921 (see Carslaw, ref). Many
image references do not have drift, which requires the extra factor $K(x_a)$ in the text.


16 What about the paths that cross over the boundary? The reader may complain that,
after all, the paths contributing to $G_{ab}$ are unrestricted and some can cross over the
boundary at S = H. The response is that we are not allowed to look at the solution in the 
unphysical region for the up and out option. Therefore, only those paths that stay below 
the barrier are relevant for the up and out option. The paths that cross the barrier are 
relevant to a different problem, the up and in option as we discuss next. 
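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The factor $K(x_a)$ is needed to solve the equation $\hat{O}_{x_a t_a}\, G_{ab}^{(image)} = 0$ with drift present. It is given by

$$K(x_a) = \exp\left[\frac{2\mu}{\sigma^2}\left(x^{(bdy)} - x_a\right)\right] \qquad (17.5)$$

Equivalently, we can write

$$G_{ab}^{(image)} = \tilde{K}(x_b)\, G_{ab}\Big|_{x_b \to x_b^{(image)}}, \qquad \tilde{K}(x_b) = \exp\left[-\frac{2\mu}{\sigma^2}\left(x^{(bdy)} - x_b\right)\right] \qquad (17.6)$$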


We have $K(x_a)\,\tilde{K}(x_b) = \exp\left[2\mu\left(x_b - x_a\right)/\sigma^2\right]$.


The Semi-Group Property including a Barrier 


In describing Green functions without barriers, the semigroup property is the 
critical ingredient in proceeding from infinitesimal time steps to finite time steps. 
The path integral in general satisfies the semigroup property. Here the knock-out 
(KO) Green function has been derived assuming a constant barrier. The KO 
Green function satisfies a semigroup property, but now we must be careful to 
restrict the region of integration in the semigroup equation to be the physical 
region. Physically this is just because we are following along paths that never 
cross the barrier. The up and out Green function satisfies the following 
semigroup property: 
$$G_{ac}^{(KO)} = \int_{-\infty}^{x^{(bdy)}} G_{ab}^{(KO)}\, G_{bc}^{(KO)}\, dx_b \qquad (17.7)$$

Here, the integral is over the up and out physical region below the barrier, namely $S < H$ or $x < x^{(bdy)}$, where again $x = \ln S$ and $x^{(bdy)} = \ln H$. This result can be proved explicitly by direct substitution and some algebra. In performing this, it is handy to use both expressions for the image Green function given above.

In the down and out case, for which the physical region is $S > H$ or $x > x^{(bdy)}$, we get the semigroup property satisfied by the KO Green function as
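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$$G_{ac}^{(KO)} = \int_{x^{(bdy)}}^{\infty} G_{ab}^{(KO)}\, G_{bc}^{(KO)}\, dx_b \qquad (17.8)$$

We have the semigroup property for the KO Green function symbolically as¹⁷

$$G_{ac}^{(KO)} = G_{ab}^{(KO)} \otimes G_{bc}^{(KO)} \qquad (17.9)$$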
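A quick numerical sanity check of the up-and-out semigroup property can be sketched in Python. The function names, the trapezoid-rule integration, and the sample parameters are illustrative assumptions; the Green functions follow the Gaussian form with drift and the image term with the factor $K(x_a)$ described in the text.

```python
import math


def g_free(xa, xb, t, mu, sigma):
    """Unconstrained Gaussian Green function with drift."""
    var = sigma * sigma * t
    return (math.exp(-(xb - xa - mu * t) ** 2 / (2.0 * var))
            / math.sqrt(2.0 * math.pi * var))


def g_ko(xa, xb, t, mu, sigma, x_bdy):
    """KO Green function: free term minus image term with factor K(x_a)."""
    k = math.exp(2.0 * mu * (x_bdy - xa) / sigma ** 2)
    return (g_free(xa, xb, t, mu, sigma)
            - k * g_free(2.0 * x_bdy - xa, xb, t, mu, sigma))


def ko_semigroup_up_out(xa, xc, t_ab, t_bc, mu, sigma, x_bdy,
                        n=4000, width=12.0):
    """Integral of G_ab(KO) * G_bc(KO) over x_b < x_bdy (trapezoid rule)."""
    lower = x_bdy - width * sigma * math.sqrt(t_ab + t_bc)
    h = (x_bdy - lower) / n
    total = 0.0
    for i in range(n + 1):
        xb = lower + i * h
        f = (g_ko(xa, xb, t_ab, mu, sigma, x_bdy)
             * g_ko(xb, xc, t_bc, mu, sigma, x_bdy))
        total += f if 0 < i < n else 0.5 * f
    return total * h


# Hypothetical parameters: barrier at x_bdy = 0, start and end below it
lhs = g_ko(-0.3, -0.2, 1.2, 0.03, 0.2, 0.0)
rhs = ko_semigroup_up_out(-0.3, -0.2, 0.5, 0.7, 0.03, 0.2, 0.0)
```

The integration is restricted to the physical region below the barrier; extending it above the barrier would spoil the agreement between `lhs` and `rhs`.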


Calculating Barrier Options 


We are now in a position to calculate knock out European constant-barrier options. These are given by taking the expectation of the option payoff $C_{payoff}$ using the KO Green function, multiplying by the discount factor that we separate out explicitly. For the up-and-out option (with constant parameters) we have

$$C^{(UpOut)} = e^{-r t_{ab}} \int_{-\infty}^{x^{(bdy)}} G_{ab}^{(KO)}\, C_{payoff}(x_b, t_b)\, dx_b \qquad (17.10)$$

Similarly, a down and out option has the limits of integration from $x^{(bdy)}$ to $\infty$. Various payoffs can be considered. We may have a standard option payoff, a step payoff, a squared payoff, etc. The standard option can be a call or a put. We may have stock options, FX options, commodity options, etc. for the choice of the underlying variable. Using the integrals below we can evaluate the various possibilities. For example, with constant parameters (risk free rate $r$, volatility $\sigma$, no-arbitrage drift $\mu = r - \sigma^2/2$ in the absence of dividends), and initial price $S_0 = S(t_a)$, a stock down and out call option is¹⁸

$$C^{(DownOut)} = C^{(BlackScholes)} - C^{(image)} \qquad (17.11)$$

Here, $C^{(BlackScholes)}$ is the standard Black-Scholes call option formula¹⁹


17 Suggested homework: Do these calculations. They’re not so hard and you will get a 
feeling for the important semigroup property. Try not to look at the useful integrals below 
at least for the first half-hour. 


18 Notation: The exercise time $t^*$ is $t_b$ here, and $t_{ab} = t_b - t_a$ is the time to expiration. Also,
often the notation $d_1 = d_+$, $d_2 = d_-$ is used.


? Black Scholes Homework: You now have enough information to derive the Black- 
Scholes formula and the image formula yourself. More information is given in Ch. 42. 
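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$$C^{(BlackScholes)} = S_0\, N(d_+) - E\, e^{-r t_{ab}}\, N(d_-) \qquad (17.12)$$

Here, $d_\pm = \left[\ln(S_0/E) + \left(r \pm \sigma^2/2\right) t_{ab}\right] / \left(\sigma\sqrt{t_{ab}}\right)$. Also, $C^{(image)}$ is the image term, which reduces the standard call value:

$$C^{(image)} = \left(\frac{H}{S_0}\right)^{2\mu/\sigma^2} \left[\frac{H^2}{S_0}\, N(\tilde{d}_+) - E\, e^{-r t_{ab}}\, N(\tilde{d}_-)\right] \qquad (17.13)$$

Here, $\tilde{d}_\pm = d_\pm\left[\ln S_0 \to \ln(H^2/S_0)\right]$. This replacement is the same as the replacement $x_a \to x_a^{(image)}$.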
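The down-and-out call as Black-Scholes minus an image term can be sketched in Python as below. The function names and sample numbers are hypothetical. The sketch assumes no dividends and a barrier at or below the strike; it writes the image term as the factor $(H/S_0)^{2\mu/\sigma^2}$ times a Black-Scholes call evaluated at $H^2/S_0$, which is the standard closed form for this case.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution N(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def bs_call(s0, strike, r, sigma, t):
    """Black-Scholes call without dividends."""
    d_plus = ((math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * t)
              / (sigma * math.sqrt(t)))
    d_minus = d_plus - sigma * math.sqrt(t)
    return s0 * norm_cdf(d_plus) - strike * math.exp(-r * t) * norm_cdf(d_minus)


def down_out_call(s0, strike, barrier, r, sigma, t):
    """Down-and-out call = Black-Scholes call minus image term.

    ASSUMPTION: barrier <= strike; the option is worthless at the barrier.
    """
    if s0 <= barrier:
        return 0.0
    mu = r - 0.5 * sigma ** 2
    # Image factor (H/S0)^(2 mu / sigma^2), i.e. K(x_a) in stock variables
    k_factor = (barrier / s0) ** (2.0 * mu / sigma ** 2)
    # Image term: Black-Scholes call with S0 replaced by H^2 / S0
    image = k_factor * bs_call(barrier ** 2 / s0, strike, r, sigma, t)
    return bs_call(s0, strike, r, sigma, t) - image


# Hypothetical example: spot 100, strike 100, barrier 90, one year
vanilla = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
knock_out = down_out_call(100.0, 100.0, 90.0, 0.05, 0.2, 1.0)
```

The knock-out value lies below the vanilla value, goes to zero as the spot approaches the barrier, and approaches the vanilla value as the barrier is moved far below the spot.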


Knock-In Options 


We now consider the KI knock-in Green function $G_{ab}^{(KI)}$. This contains all paths that cross the barrier. This set equals all paths minus the set of all paths not crossing the barrier. The picture shows paths contributing to an up and in option:

[Figure: Up-in paths and their Image paths. Labels: the image path $x_{image}(t)$; the barrier $x^{(bdy)}$; final points $x_b > x^{(bdy)}$ above the barrier and $x_b < x^{(bdy)}$ below the barrier.]
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The KO Green function contains the set of all paths not crossing the barrier. 
Therefore, $G_{ab}^{(KI)}$ in the KO physical region must be the usual Green function $G_{ab}$ minus the KO Green function $G_{ab}^{(KO)}$. This is just the image Green function $G_{ab}^{(image)}$. Physically this is because we are seeing the image paths that have crossed over the barrier into the physical region for KO. In addition, $G_{ab}^{(KI)}$ in the unphysical region for KO must just be $G_{ab}$. Physically this is because we see all the usual paths that have crossed over the barrier into the unphysical region for KO.

If a path crosses the barrier going up, its image path crosses the barrier going 
down. The final point after the transition across the barrier can be either above or 
below the barrier. Hence the up and in Green function consists of two parts, the 
standard Green function above the barrier and the image Green function below 
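the barrier. Explicitly, the KI up-in Green function is

$$G_{ab}^{(KI;\,up\text{-}in)} = \begin{cases} G_{ab} & \text{if } x_b > x^{(bdy)} \\ G_{ab}^{(image)} & \text{if } x_b < x^{(bdy)} \end{cases} \qquad (17.14)$$

Hence if both $x_a, x_b < x^{(bdy)}$ we have

$$\mathrm{Prob}\left[(x_a, t_a) \to (x_b \text{ in } dx_b, t_b)\right] = \begin{cases} dx_b\, G_{ab}^{(KO)} & \text{(paths not crossing the barrier)} \\ dx_b\, G_{ab}^{(image)} & \text{(paths crossing the barrier)} \end{cases} \qquad (17.15)$$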

The astute reader may complain again that the paths after a barrier crossing takes place have no restriction and may cross back over again. To satisfy this point, we consider the following consistency condition. We use the semi-group property to get the up & in Green function by following the paths. Consider the set of paths that start at time $t_a$, do not cross before time $t_j - dt$, then cross in time $dt$, and are then unrestricted. The paths that do not cross before time $t_j - dt$ are contained in the KO Green function from $t_a$ to $t_j - dt$. The Green function for crossing in time $dt$ but otherwise unrestricted is the standard Green function propagating from $t_j - dt$ to $t_j$. The paths that are unrestricted after
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crossing are contained in the standard Green function from time $t_j$ to the final time $t_b$. Plugging all this in and integrating over all possible crossing times does produce the up and in Green function.

Explicitly, for an upward crossing the convolution of the KO Green function and the Green function for crossing is found after some algebra including expanding in the infinitesimal $dt$ to be

$$\left[G^{(KO)} \otimes G^{(crossing)}\right]_{aj} = \frac{dt}{\tau_{aj}}\,\left|x^{(bdy)} - x_a\right| \cdot \mathrm{Gaussian}\left[\left(x_a \to x^{(bdy)}\right) \text{ in time } \tau_{aj}\right] \qquad (17.16)$$

Multiplying by the standard Green function starting at $x^{(bdy)}$ at time $t_j$ to $t_b$ and integrating over the crossing time, with a little help from the useful integrals in the section below, then gives the up-in Green function.


European option values are obtained by multiplying the option payoff $C_{payoff}$
by the KI Green function, along with the discount factor. There will be two
contributions corresponding to the two possibilities for the paths after the barrier 
crossing being on one side or the other of the barrier at the exercise time. 


Useful Integrals for Barrier Options 


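Here are some useful integrals for calculating constant single barrier options. Define

$$I(x_a, \lambda) = \int_{\ln A}^{\ln B} dx_b\, e^{\lambda x_b}\, \frac{1}{\sqrt{2\pi\sigma^2\tau}} \exp\left[-\frac{\left(x_b - x_a - \mu\tau\right)^2}{2\sigma^2\tau}\right] \qquad (17.17)$$

Here $\lambda$ is a parameter. Generally, $\lambda = 0, 1$. We get after completing the square,

$$I(x_a, \lambda) = \exp\left[\lambda\left(x_a + \mu\tau\right) + \lambda^2\sigma^2\tau/2\right] \left[N(v_{max}) - N(v_{min})\right] \qquad (17.18)$$

Here, $v_{max} = \left[x_a - \ln A + \mu\tau + \lambda\sigma^2\tau\right] / \left(\sigma\sqrt{\tau}\right)$ and $v_{min} = v_{max}(A \to B)$.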
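The completed-square result can be checked against direct numerical integration with a short Python sketch. The function names and test parameters are hypothetical, and the integrand is assumed to be $e^{\lambda x_b}$ times the Gaussian Green function with drift $\mu$ and volatility $\sigma$ over time $\tau$, integrated over $\ln A < x_b < \ln B$.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution N(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def i_closed(xa, lam, a_low, b_high, mu, sigma, tau):
    """Closed form after completing the square."""
    s = sigma * math.sqrt(tau)
    shift = xa + mu * tau + lam * sigma ** 2 * tau
    v_max = (shift - math.log(a_low)) / s
    v_min = (shift - math.log(b_high)) / s
    pref = math.exp(lam * (xa + mu * tau) + 0.5 * lam ** 2 * sigma ** 2 * tau)
    return pref * (norm_cdf(v_max) - norm_cdf(v_min))


def i_numeric(xa, lam, a_low, b_high, mu, sigma, tau, n=20000):
    """Trapezoid-rule evaluation of the same integral, as a cross-check."""
    lo, hi = math.log(a_low), math.log(b_high)
    h = (hi - lo) / n
    var = sigma ** 2 * tau
    norm = math.sqrt(2.0 * math.pi * var)
    total = 0.0
    for i in range(n + 1):
        xb = lo + i * h
        f = math.exp(lam * xb - (xb - xa - mu * tau) ** 2 / (2.0 * var)) / norm
        total += f if 0 < i < n else 0.5 * f
    return total * h


# Hypothetical inputs: S0 = 100, range 90 < S < 120, one year
args = (math.log(100.0), 1.0, 90.0, 120.0, 0.03, 0.2, 1.0)
closed, numeric = i_closed(*args), i_numeric(*args)
```

For $\lambda = 0$ the integral is just the probability of ending in the range; for $\lambda = 1$ it is the stock-weighted piece that appears in call and put payoffs.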
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We also have 
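$$I_{3/2}(a,b) = \int_0^{\tau} \frac{dt}{t^{3/2}} \exp\left(-\frac{a}{t} - bt\right) = \sqrt{\frac{\pi}{a}} \left[e^{-2\sqrt{ab}}\, N(d_+) + e^{2\sqrt{ab}}\, N(d_-)\right] \qquad (17.19)$$

Here, $d_\pm = \left[-\sqrt{2a} \pm \sqrt{2b}\,\tau\right] / \sqrt{\tau}$. The KO time $\tau_{KO}$ (cf. Ch. 5, p. 47) has $b = \mu^2/(2\sigma^2)$ and $a = \left(x_a - x^{(bdy)}\right)^2/(2\sigma^2)$. All square roots are positive. Also,

$$I_{1/2}(a,b) = \int_0^{\tau} \frac{dt}{t^{1/2}} \exp\left(-\frac{a}{t} - bt\right) = -\frac{\partial}{\partial b} I_{3/2}(a,b) = \sqrt{\frac{\pi}{b}} \left[e^{-2\sqrt{ab}}\, N(d_+) - e^{2\sqrt{ab}}\, N(d_-)\right] \qquad (17.20)$$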
Next, 
А dt a a 
J(a,a)= | (so Tr (r-t) 
o t? (z" -1) (17.21) 


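Finally the key in unlocking the semigroup for barrier Green functions lies in the two following integrals

$$\int_{-\infty}^{x^{(bdy)}} G_{ab}^{(image)}\, G_{bc}^{(image)}\, dx_b = \int_{x^{(bdy)}}^{\infty} G_{ab}\, G_{bc}\, dx_b \qquad (17.22)$$

$$G_{ac}^{(image)} = \int_{-\infty}^{x^{(bdy)}} \left[G_{ab}^{(image)}\, G_{bc} + G_{ab}\, G_{bc}^{(image)}\right] dx_b \qquad (17.23)$$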
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Single Barrier Rebates at Touch and at Maturity 


Rebates are sometimes given when a knockout occurs as a sort of consolation 
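prize. Depending on the deal, a rebate $R$ can be paid at the exercise time $t^*$ of the original option, or the rebate can be paid “at touch” at the time a boundary is hit. The single-barrier rebate paid at touch has a closed form obtained by integrating the discounted barrier-crossing density, with crossing time $\tau = \tau_{KO}$, over $0 \le \tau \le t^*$ (cf. Eqs. 17.13 and 17.16). Defining²⁰

$$I\left(a; b; x_0 - x^{(bdy)}\right) = \frac{\left|x_0 - x^{(bdy)}\right|}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{\mu}{\sigma^2}\left(x_0 - x^{(bdy)}\right)\right] I_{3/2}(a,b) \qquad (17.24)$$

where $I_{3/2}(a,b)$ is the integral of Eq. 17.19. We set $b = \mu^2/(2\sigma^2) + r$, $a = \left(x_0 - x^{(bdy)}\right)^2/(2\sigma^2)$, $\mu = r - \sigma^2/2$ and obtain the single barrier rebate at touch:

$$C^{(Single\ Barrier\ Rebate\ at\ touch)} = R \cdot I\left(a; b; x_0 - x^{(bdy)}\right) \qquad (17.25)$$

Redefining $b = \mu^2/(2\sigma^2)$ with $a$, $\mu$ as above, the probability for no knockout with a single barrier is

$$P^{(no\ KO)} = 1 - I\left(a; b; x_0 - x^{(bdy)}\right) \qquad (17.26)$$

The single barrier rebate at maturity (with knockout before maturity) is then

$$C^{(Single\ Barrier\ Rebate\ at\ maturity)} = e^{-r\tau}\, R \left[1 - P^{(no\ KO)}\right] \qquad (17.27)$$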
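The rebate formulas can be sketched in Python as follows. The function names and sample numbers are hypothetical; the closed form used for the integral of $t^{-3/2}\exp(-a/t - bt)$ over $0 < t < \tau$ is the standard one in terms of cumulative normals, and the absolute value of $x_0 - x^{(bdy)}$ is assumed in the prefactor.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution N(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def i_three_halves(a, b, tau):
    """Integral of t**-1.5 * exp(-a/t - b*t) over 0 < t < tau (a > 0, b >= 0)."""
    d_plus = (-math.sqrt(2.0 * a) + math.sqrt(2.0 * b) * tau) / math.sqrt(tau)
    d_minus = (-math.sqrt(2.0 * a) - math.sqrt(2.0 * b) * tau) / math.sqrt(tau)
    root = 2.0 * math.sqrt(a * b)
    return math.sqrt(math.pi / a) * (math.exp(-root) * norm_cdf(d_plus)
                                     + math.exp(root) * norm_cdf(d_minus))


def touch_integral(x0, x_bdy, mu, sigma, b, tau):
    """I(a; b; x0 - x_bdy): prefactor times the t**-1.5 integral."""
    dist = x0 - x_bdy
    a = dist ** 2 / (2.0 * sigma ** 2)
    pref = (abs(dist) / math.sqrt(2.0 * math.pi * sigma ** 2)
            * math.exp(-mu * dist / sigma ** 2))
    return pref * i_three_halves(a, b, tau)


def rebate_at_touch(rebate, s0, barrier, r, sigma, tau):
    mu = r - 0.5 * sigma ** 2
    b = mu ** 2 / (2.0 * sigma ** 2) + r      # includes discounting
    return rebate * touch_integral(math.log(s0), math.log(barrier),
                                   mu, sigma, b, tau)


def prob_no_knockout(s0, barrier, r, sigma, tau):
    mu = r - 0.5 * sigma ** 2
    b = mu ** 2 / (2.0 * sigma ** 2)          # no discounting
    return 1.0 - touch_integral(math.log(s0), math.log(barrier),
                                mu, sigma, b, tau)


def rebate_at_maturity(rebate, s0, barrier, r, sigma, tau):
    p_hit = 1.0 - prob_no_knockout(s0, barrier, r, sigma, tau)
    return math.exp(-r * tau) * rebate * p_hit
```

As a consistency check, the knockout probability from this sketch agrees with the textbook reflection-principle formula for a Brownian motion with drift, and the rebate at touch lies between the rebate at maturity and the undiscounted knockout probability times the rebate.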


20 Homework: Absolute Value? I believe that there is an absolute value as written on the
right hand side of the equation for $I\left(a; b; x_0 - x^{(bdy)}\right)$, but I have not proved it. So that's a nice
homework problem. Send me an email if you figure it out.
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Other Topics Involving Single Barrier Options 


Merton Model with Continuous Barrier 


A generalization by Black and Cox of the original Merton model contains a
continuous barrier to represent the default of a firm. In Ch. 31 we give new 
results for a two-dimensional generalized Merton model. 


A Useful Discrete Barrier Approximation 


We have been focusing on a continuous barrier. For discrete barriers, aside from 
brute-force calculations, an analytic trick often gives reasonable approximations. 
The idea is that the image Green function is present for continuous barriers and 
absent for no barrier. Therefore, we look for an interpolating factor to multiply 
the image Green function that disappears as the continuous barrier becomes 
discrete with fewer and fewer points”. 


Let $\Delta t_{sampling}$ be the time between the discrete points of the barrier (assume barrier points are regularly spaced in time). Define the factor

$$\lambda^{(\gamma)}\left(\Delta t_{sampling}\right) = \left[\frac{T_{ab} - \Delta t_{sampling}}{T_{ab}}\right]^{\gamma} \qquad (17.28)$$

Here, $T_{ab}$ is the total time over which the discrete barrier extends and $\gamma > 0$ is a parameter to be determined. Now write the discrete KO Green function approximation $G_{ab}^{(KO;\,discrete)}$ as

$$G_{ab}^{(KO;\,discrete)} \approx G_{ab} - \lambda^{(\gamma)}\left(\Delta t_{sampling}\right) G_{ab}^{(image)} \qquad (17.29)$$

For continuous barriers, $\Delta t_{sampling} \to 0$ and $\lambda^{(\gamma)} \to 1$. If the discrete barrier disappears, $\Delta t_{sampling} \to T_{ab}$ and $\lambda^{(\gamma)} \to 0$. Hence $G_{ab}^{(KO;\,discrete)}$ reduces to the appropriate limits. Numerically it turned out that $\gamma \approx 0.6$ provides a good approximation for a number of examples.
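A minimal Python sketch of this interpolation follows. The function names are hypothetical, and the specific form of the factor, $[(T - \Delta t)/T]^{\gamma}$, is an assumption chosen only because it has the two stated limits (1 for a continuous barrier, 0 when the barrier disappears); the value 0.6 is the text's empirical estimate.

```python
def discrete_barrier_factor(dt_sampling, t_total, gamma=0.6):
    """Interpolating factor: 1 for a continuous barrier, 0 for no barrier.

    ASSUMPTION: the factor has the form ((T - dt) / T) ** gamma.
    """
    if not 0.0 <= dt_sampling <= t_total:
        raise ValueError("need 0 <= dt_sampling <= t_total")
    return ((t_total - dt_sampling) / t_total) ** gamma


def discrete_ko_value(free_value, image_value, dt_sampling, t_total, gamma=0.6):
    """Free term minus the damped image term.

    Works for any quantity linear in the Green function, e.g. an option
    value written as (no-barrier value) minus (image value).
    """
    factor = discrete_barrier_factor(dt_sampling, t_total, gamma)
    return free_value - factor * image_value


# Hypothetical example: monthly barrier monitoring over one year
monthly_factor = discrete_barrier_factor(1.0 / 12.0, 1.0)
approx_value = discrete_ko_value(10.0, 2.0, 1.0 / 12.0, 1.0)
```

In the spirit of the text, such a quick estimate is best used as a sanity check on a Monte Carlo discrete-barrier calculation, not as a replacement for it.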


?! History: I developed this discrete barrier approximation in 1997. The motivation was 
to come up with some sort of analytic approximation to perform sanity checks on Monte 
Carlo simulators for discrete barrier calculations. The approximation can be used for 
reasonable and quick approximate calculations. 
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“Potential Theory” for General Sets of Single Barriers 

We now use the semigroup property to obtain results for complex successions of 
single barriers”. In principle, there is no reason why the barriers cannot be at 
different levels. Repeatedly applying the semigroup property produces a 
sequence of convolution integrals that resembles a sort of “potential theory”. The 
results can be expressed in terms of multivariate integrals. 


An example is for two barriers called x) for times in (tt ) , and х\%®) 


for times in (t ee 3 ) . We have the result: 


(Kr) 


(Total) _ (KO) 
E ie ` d 2 н ® dt 9 баг 


(17.30) 


We use the notation $\otimes$ to signify convolution integration. Here is a picture:


[Figure: “Potential Theory” for a General Single Barrier]


The integrals have to be chosen to be over the appropriate limits for the 
multiple barrier conditions. For the example, the regions of integration are 


? History: These results are in my 1993 SIAM talk; my work on barriers started in 1991. 
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(bdy) (bdy) (bdy) 


X,,X, 7X, and x, €x, , while the final point satisfies x, >x,,°’. So the 
oo oo xD 
integrals are of the form 1 ах, Í dx, 1 dx, . 
ODV) xD) —00 
bc bc 


Step Barriers 


A special case involves successive “steps”, i.e. successive single barriers with
different levels. The influence of a given step on paths depends on whether
the path sees a step “up” or a step “down” as it moves along in time. For a step 
“up”, some paths will hit the step and knock out. For a step “down”, some paths 
will migrate downward depending on the drift and the volatility. 


Complicated Barrier Options and Numerical Techniques 


In general, for complicated barriers, non-constant parameters, skewed volatility 
etc. the analytic expressions are no longer valid and numerical techniques have to 
be employed. These are usually Monte-Carlo or else discretization of the 
diffusion equation with the boundary conditions imposed. The discretization for a 
flat boundary or piecewise flat boundary is carried out most easily with a 
rectangular lattice in $S$-space, since this can be chosen to match the boundary.


Barrier Options with Time-Dependent Drifts and Volatilities 
With time-dependent drift $\mu(t)$ and volatility $\sigma(t)$, the semigroup property can still be used to construct the path integral, but in general, the path integral cannot be evaluated analytically. One special case that yields tractable results occurs if $\mu(t)/\sigma^2(t)$ is constant if the underlying process is not an interest rate, or if the slope of $\mu(t)/\sigma^2(t)$ is $-1$ if the underlying process is the short rate (the change is due to discounting). However, these conditions are not realized in practice.


? Barrier Step Calculations: I performed some barrier step calculations in 1996 that 
were inconclusive. These involved trying to calculate the effects of a step (i.e. two 
successive barriers at different levels but no intervening gap) explicitly. This was done by 
turning the step into a ramp over time dt. Before and after the step, the appropriate image 
solutions for constant barriers are used. The problem involved enforcing the zero 
boundary condition on the ramp. These considerations probably do not affect the validity 
of the potential theory given above. Vladimir Linetsky (Ref) has investigated step barrier 
options. He obtained results consistent with the potential theory. 
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Barrier Gaussian Rate Model with No Negative Rates? 


A long discussion of Gaussian interest rate models is in Ch. 43. While tractable,
Gaussian models have the wart of negative interest rates. A practical problem for
which it is tempting to use barrier technology involves modifying the Gaussian
rate model by using an explicit rate barrier at zero, $r^{(bdy)} = 0$, along with the
knock-out Green function including only paths for positive rates. The problem is
that the shape of the forward curve produces time-dependent drifts that spoil the
simple results.

One idea was to use a discrete step approximation to the forward curve. Then 
KO Green functions would be defined for each of the steps, and the potential 
theory convolutions would be used to get the KO Green function over large 
times, with only positive rates. 
Explicitly, we can use the $r^{(bdy)} = 0$ barrier along with local drifts for KO
Green functions determined through consistency with the zero-coupon bond
market data. Each local $(t_j, t_{j+1})$ drift $\mu_j$ is calculated to be consistent with the
corresponding forward bond price $P^{(f)}(t_j, t_{j+1})$ obtained from the forward rate
curve today $f_j$. We need to set $r_j = f_j$ on the forward curve.
Consider expectations over $G^{(KO)}$ times a local discount factor $e^{-r_j dt_j}$,
integrating only over positive rates. Then $\mu_j$ is determined such that the
expectation of the maturity $t_{j+1}$ terminal value 1 is $P^{(f)}(t_j, t_{j+1})$ at $t_j$.


Once the local drifts are determined, the potential theory convolutions are 
used to generate the Green function over arbitrary times. Once the Green function 
is determined, contingent claims can be calculated. Alternatively, lattice 
numerical codes could be used. 


? Gaussian Non-Negative Rate Model using Barriers: I had this idea in the early 90’s 
and described it at Merrill at a seminar in 1/93, but there was never time for numerical 
testing. I since heard that a similar idea was implemented somewhere. 
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18. Double Barrier Options (Tech. Index 7/10) 


In this chapter, we treat double barrier options. These are options that depend on
one underlying variable, but which have two barriers (upper and lower). The 
underlying process starts between the two barriers. The idea is in the figure: 


[Figure: Double-Barrier Path Classification for Options]


' History: The double-barrier Green function is in my notes from 1991. Examples were 
worked out in 1996. 


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The picture shows a double KO path and one mixed KO/KI path for each of 
the barriers. A double KI path (not shown) would cross both barriers. 


The conditions on these two barriers $x_U^{(bdy)}$ and $x_L^{(bdy)}$ can be different. We
may be interested in whether the paths always stay between the two barriers as a
requirement (a double KO), or whether the paths cross one boundary but not the
other (a mixed KO/KI), or whether the paths can cross either barrier (a double
KI). The image Green function has to satisfy the diffusion equation and in
addition be zero on both boundaries.

As usual, we assume constant continuous barriers and constant parameters so
that we can get analytic results. Cases that are more general have a path integral
solution, but require the usual Monte Carlo or lattice techniques for evaluation.


Double Barrier Solution with an Infinite Set of Images 


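The analytic solution to the single barrier problem has one image. The double
barrier problem has an infinite set of images. The first quartet is shown below:

[Figure: First Quartet of Images for Double Barrier. From top to bottom: $X_{LU}^{(image)}$, the image of $X_L^{(image)}$ for $x_U^{(bdy)}$; $X_U^{(image)}$, the usual image for $x_U^{(bdy)}$; the path point $x$ between the barriers; $X_L^{(image)}$, the usual image for $x_L^{(bdy)}$; $X_{UL}^{(image)}$, the image of $X_U^{(image)}$ for $x_L^{(bdy)}$.]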


2 Notation: Knock out (KO), knock in (KI), down and out (DO), up and in (UI), double
knock out (DKO). Subscripts on a Green function $G_{0*}$ mean that the initial time is $t_0$ and
the exercise time is $t^*$ as shown in the figure. In this chapter, the Green functions do not
include the discount factor.
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The construction of the images is a straightforward generalization by 
induction of the situation for one barrier. Consider the upper barrier at $S = H_U$
first, ignoring the lower barrier $S = H_L$. At time $t$ with $x = \ln S(t)$ along a
given path, we construct the usual image $X_U^{(image)}$ for the upper barrier as
$X_U^{(image)} = 2x_U^{(bdy)} - x$ where $x_U^{(bdy)} = \ln H_U$. Now consider the lower barrier
$H_L$. We construct the usual image for the lower barrier as $X_L^{(image)} = 2x_L^{(bdy)} - x$
where $x_L^{(bdy)} = \ln H_L$. These are just like the single-barrier images.

Consider the upper barrier again. We now need an extra upper-barrier image
$X_{LU}^{(image)}$ to cancel the effect of the usual lower-barrier image $X_L^{(image)}$ on the
upper barrier, viz $X_{LU}^{(image)} = 2x_U^{(bdy)} - X_L^{(image)}$. Similarly, we need a new
image $X_{UL}^{(image)}$ to cancel the effect of $X_U^{(image)}$ on the lower barrier, viz
$X_{UL}^{(image)} = 2x_L^{(bdy)} - X_U^{(image)}$. Note that the new images $X_{LU}^{(image)}$, $X_{UL}^{(image)}$
are further away than are the usual images $X_U^{(image)}$, $X_L^{(image)}$. These new images
will contribute with a plus sign to cancel out the negative sign of the usual
images. This process clearly continues indefinitely. Naturally, we need to keep
track of the signs of the terms in order to ensure cancellations on each boundary.

Each image is associated with its own image Green function. The image
Green function is exactly of the form of the single barrier image Green function
with the appropriate image parameters. Hence, the solution to the double barrier
problem is given by an infinite set of single-barrier image Green functions with
different arguments corresponding to the positions of the images.
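The image construction can be sketched numerically. The example below is an illustration under simplifying assumptions, not the book's full formula: it takes zero drift, so the extra image factors are all one, and it reflects the starting point rather than the running path point. The function names are hypothetical. The signed image sum is cross-checked against the standard sine-series (eigenfunction) form of the same absorbed-diffusion density, and it should vanish on both barriers.

```python
import math

def image_positions(x0, a, b):
    """First quartet of images of x0 for lower barrier a and upper barrier b (logs)."""
    X_U = 2.0 * b - x0          # usual image for the upper barrier
    X_L = 2.0 * a - x0          # usual image for the lower barrier
    X_LU = 2.0 * b - X_L        # image of X_L for the upper barrier
    X_UL = 2.0 * a - X_U        # image of X_U for the lower barrier
    return {"U": X_U, "L": X_L, "LU": X_LU, "UL": X_UL}

def gauss(y, var):
    """Gaussian density with zero mean and variance var."""
    return math.exp(-0.5 * y * y / var) / math.sqrt(2.0 * math.pi * var)

def dko_green_images(x0, x, a, b, sigma, tau, n_max=10):
    """Driftless double-knock-out Green function as a signed sum over images."""
    var = sigma**2 * tau
    width = b - a
    total = 0.0
    for n in range(-n_max, n_max + 1):
        shift = 2.0 * n * width
        total += gauss(x - x0 - shift, var)                 # even reflections: plus sign
        total -= gauss(x - (2.0 * a - x0) - shift, var)     # odd reflections: minus sign
    return total

def dko_green_sine_series(x0, x, a, b, sigma, tau, n_terms=60):
    """Same density from the eigenfunction expansion between absorbing walls."""
    width = b - a
    total = 0.0
    for k in range(1, n_terms + 1):
        decay = math.exp(-0.5 * (k * math.pi * sigma / width) ** 2 * tau)
        total += (math.sin(k * math.pi * (x0 - a) / width)
                  * math.sin(k * math.pi * (x - a) / width) * decay)
    return 2.0 * total / width
```

The two representations converge in opposite regimes: the image sum is fastest for short times or distant barriers, the sine series for long times or close barriers.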


3 Notation: The barrier $H$ is always quoted in the underlying, say stock price, but the
images are constructed in $x$, the logarithm of the stock price. In addition, we mostly leave
the time index off in this section to avoid confusion with the barrier labels.


4 Convergence of the Image Sums: The astute reader no doubt is worried sick about
how well the sums of the images converge. Because successive images move further and 
further away, the quadratic exponential damping of the Gaussians in the Green functions 
provides damping. In practical terms, the rate of convergence numerically depends on 
how close the boundaries are to each other, how close the starting point is to one of the 
boundaries, the magnitude of the volatility, whether the drift carries paths near a 
boundary, and so forth. The numerical convergence can be enhanced by grouping 
together quartets of images (two successive above and two successive below). This 
quartet grouping tends to cancel out oscillation instabilities that occur when images are 
added in one at a time. Although it is not of much help, infinite image sums are related to 
theta functions that occur in elliptic function theory (see Morse and Feshbach, Ref). 
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Double Barrier Option Pricing 


The simplest double-barrier option is perhaps the double knockout (DKO) option
(up-out UO across the upper barrier, down-out DO across the lower barrier),
$C[UO(H_U), DO(H_L)]$. The DKO option is given by the usual Black-Scholes
result corrected by image terms, conveniently expressed via infinite sums of
images arranged by quartets (we have explicitly exhibited the first quartet above).
The sums contain Black-Scholes option expressions (multiplied by the extra
$K_{Index}$ factors), with appropriate image parameters. There are also digital
options that occur when a barrier stops a payoff from being able to go to zero
when the underlying equals the strike.

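For example, a double knockout (DKO) European call option is the integral
between the barriers of the payoff at exercise multiplied by the DKO Green
function $G_{0*}^{(DKO)} = G_{0*}^{(free)} - G_{0*}^{(DKO\_image)}$ and discounted back from the exercise
date,

$$C^{(DKO)} = e^{-r\tau_{0*}} \int_{\ln H_L}^{\ln H_U} dx^*\, G_{0*}^{(DKO)}\, \left[e^{x^*} - E\right]_+ \qquad (18.1)$$

If the strike is between the barriers, $H_L < E < H_U$ (the usual case), we get the
expression for the DKO call as

$$C^{(DKO)} = e^{-r\tau_{0*}} \int_{\ln E}^{\infty} dx^*\, G_{0*}^{(DKO)} \left[e^{x^*} - E\right] - e^{-r\tau_{0*}} \int_{\ln H_U}^{\infty} dx^*\, G_{0*}^{(DKO)} \left[e^{x^*} - H_U\right] - \left[H_U - E\right] e^{-r\tau_{0*}} \int_{\ln H_U}^{\infty} dx^*\, G_{0*}^{(DKO)} \qquad (18.2)$$

$$= \sum_{Index = 0, U, L, LU, UL, \ldots}^{N_{max}} \eta_{Index}\, K_{Index} \left[ Call_{Index}(E) - Call_{Index}(H_U) - Digital_{Index} \right] \qquad (18.3)$$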


Here the index runs over the initial $t_0$ point $x_0$ (Index = 0) and over all images
(Index = U, L, LU, UL, ...) at the initial time $t_0$. The signs are given by
$\eta_{Index} = \pm 1$ and the extra image factors by $K_{Index}$ (with $K_0 = 1$). We have
rearranged the terms in order to exhibit the digital pieces $Digital_{Index}$. The other
terms marked $Call_{Index}$ are just Black-Scholes expressions with the strike
indicated, and appropriate arguments corresponding to the index for the images.
It is important to calculate the probability $P^{(DKO)}$ that knock out does not
occur, i.e. the probability that a path always stays between the two barriers. This
is clearly given by

$$P^{(DKO)} = \int_{\ln H_L}^{\ln H_U} dx^*\, G_{0*}^{(DKO)} \qquad (18.4)$$

This expression is evaluated as above using the infinite series of images. We get

$$P^{(DKO)} = \sum_{Index} \eta_{Index}\, K_{Index} \left\{ N\left(-d_{-,Index}(H_U)\right) + N\left(d_{-,Index}(H_L)\right) - 1 \right\} \qquad (18.5)$$

Here, $d_{-,Index}$ is the usual Black-Scholes quantity, with arguments appropriate for
the given index.
We also need the DKO digital call option $Call_{Digital}^{(DKO)}$ with strike $E$. This is an
option that pays a fixed amount, provided that the paths stay between the two
boundaries and that at exercise $x^* > \ln E$. To obtain $Call_{Digital}^{(DKO)}$, the lower limit in
$P^{(DKO)}$ is replaced by $\ln E$. The integral is multiplied by the amount paid and by the
discount factor.

Simple path counting logic gives sum rules that provide the results for mixed
KI/KO options. For example, a mixed KI/KO option consisting of up-in (UI)
across the upper barrier and down-out (DO) across the lower barrier that we call
$C[UI(H_U), DO(H_L)]$ is given by the KI/KO “sum rule”,


5 Dangerous Digital Options: Most of the complexities of barrier options arise from the
digital components. If, as time passes, the underlying gets close to one of the barriers, the
option will disappear (or get replaced by a rebate amount). This discontinuous change is
naturally difficult to hedge. If the deal is large enough, a barrier that has a probability
of being crossed becomes a red flag and a focus for risk management. Some details about
digital options and how they are hedged are given in Ch. 15.


6 DKO Option Expression: The astute reader may be nervously wondering why call
options with strikes at the upper barrier are present in the DKO expression when the 
DKO paths are restricted to remain between the barriers. As is clear from the derivation, 
these unphysical terms actually cancel out. The expression is useful for numerical 
computation. 
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C[UI(H, ),DO(H,)]=C[DO(H,)|-C[UO(H, ),DO(H,)] (18.6) 


Here $C[DO(H_L)]$ is the single-barrier down-out option across the lower
barrier and $C[UO(H_U), DO(H_L)]$ is the double barrier knockout option. What
this means is that we count paths that do not cross the lower barrier regardless
of what happens at the upper barrier; this gives $C[DO(H_L)]$. Those paths that
also stay below the upper barrier contribute to $C[UO(H_U), DO(H_L)]$ and
those that cross the upper barrier contribute to $C[UI(H_U), DO(H_L)]$. The sum
rule simply follows from tracking the paths.

Similar considerations give the double KI (DKI) options that get
contributions from paths; each relevant path must cross both barriers. The DKI
sum rule is

$$C[UI(H_U), DI(H_L)] = C^{(Euro)} - C[DO(H_L)] - C[UO(H_U)] + C[UO(H_U), DO(H_L)] \qquad (18.7)$$

Here, $C^{(Euro)}$ is just the European option with no barrier. If we ask for the
option where either one or two knock-ins occur we need to add the two single
barrier terms back in and watch out for double counting; we get another DKI sum
rule,

$$C[UI(H_U)\ \text{and/or}\ DI(H_L)] = C^{(Euro)} - C[UO(H_U), DO(H_L)] \qquad (18.8)$$
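The path-counting logic behind the sum rules can be illustrated with a small simulation. This is a sketch under stated assumptions: discretely monitored lognormal paths, a call payoff, and hypothetical function and key names. Because every path falls into exactly one class, the sum rules hold path by path, independent of Monte Carlo noise.

```python
import math
import random

def classify_paths(S0, E, H_L, H_U, r, sigma, tau,
                   n_steps=50, n_paths=2000, seed=7):
    """Discounted call values, split by which barriers each simulated path touched."""
    rng = random.Random(seed)
    dt = tau / n_steps
    drift = (r - 0.5 * sigma**2) * dt
    vol = sigma * math.sqrt(dt)
    lo, hi = math.log(H_L), math.log(H_U)
    keys = ["Euro", "DO(L)", "UO(U)", "UO(U),DO(L)",
            "UI(U),DO(L)", "UI(U),DI(L)", "UI(U) and/or DI(L)"]
    sums = {k: 0.0 for k in keys}
    for _ in range(n_paths):
        x = math.log(S0)
        hit_L = hit_U = False
        for _ in range(n_steps):
            x += drift + vol * rng.gauss(0.0, 1.0)
            hit_L = hit_L or x <= lo
            hit_U = hit_U or x >= hi
        payoff = max(math.exp(x) - E, 0.0)
        sums["Euro"] += payoff
        if not hit_L:
            sums["DO(L)"] += payoff
        if not hit_U:
            sums["UO(U)"] += payoff
        if not hit_L and not hit_U:
            sums["UO(U),DO(L)"] += payoff
        if hit_U and not hit_L:
            sums["UI(U),DO(L)"] += payoff
        if hit_U and hit_L:
            sums["UI(U),DI(L)"] += payoff
        if hit_U or hit_L:
            sums["UI(U) and/or DI(L)"] += payoff
    scale = math.exp(-r * tau) / n_paths
    return {k: scale * v for k, v in sums.items()}
```

With the returned dictionary `c`, the three rules read `c["UI(U),DO(L)"] == c["DO(L)"] - c["UO(U),DO(L)"]`, `c["UI(U),DI(L)"] == c["Euro"] - c["DO(L)"] - c["UO(U)"] + c["UO(U),DO(L)"]`, and `c["UI(U) and/or DI(L)"] == c["Euro"] - c["UO(U),DO(L)"]`, up to floating-point rounding.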


There are special anomalous cases worth mentioning. These include a call 
with the strike below both barriers, and a put with strike above both barriers. 
Extra digital step options appear in those anomalous cases. There are some extra 
numerical problems involved with evaluating these digital options. 


Rebates for Double Barrier Options 


A rebate $R$ can be paid at the exercise time $t^*$ of the original option, or the
rebate can be paid “at touch” at the time a boundary is hit. Rebates for single
barrier options were discussed in Ch. 17. Rebates for double barrier options are 
more involved, as we now discuss. 
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Rebates for paths that hit a given boundary and that are paid at maturity are 
digital versions of KI options and are obtained with the KI/KO sum rule applied 
to digitals. If the rebate is paid if either boundary is hit, a simple result is obtained

$$\text{Rebate}^{(DKO)} = R\, e^{-r\tau_{0*}} \left[ 1 - P^{(DKO)} \right] \qquad (18.9)$$


In principle one can also have two-sided rebates where both boundaries have 
to be crossed before the rebate is paid. If this rebate is paid at maturity, the DKI 
sum rule, applied to digitals, gives the result. 

Rebates at touch require that time integrals be performed including the
discount factor. A rebate at touch is worth more than a rebate paid at maturity
because there is less discounting. Rebates at touch are difficult to evaluate
analytically for double barrier options. A reasonable approximation can be
obtained as follows. We replace the usual discount factor from expiration
$\exp(-r\tau_{0*})$ with an “effective” discount factor $\exp(-r\tau_{eff})$. The “effective time
interval” $\tau_{eff}$ is $\tau_{0*}$ times the probability that knockout does not occur. If the KO
probability decreases, $\tau_{eff}$ increases, since knockout will generally occur later.


Relation for Rebates at Maturity for Double and Single Barrier 
Eq. (18.9) gives the single barrier rebate paid at maturity if $P^{(DKO)}$ is
replaced by the single-barrier probability $P^{(KO)}$ given in Ch. 17.


7 No Knockout Probabilities: P(DKO) is the probability for no knockout staying 
between two barriers and P(KO) is the probability for no knockout with one barrier. 
19. Hybrid 2-D Barrier Options (Tech. Index 7/10)


Our purpose in this chapter is to set up the mathematics used to describe the
two-dimensional hybrid option formalism. We discussed these options in Ch. 15.
The barrier options are assumed European with a single continuous barrier. There
are two underlying variables.

We call the two underlying variables $S(t)$ and $P(t)$. We assume that these
two variables are described by correlated lognormal processes with correlation
$\rho$. So we write $x(t) = \ln S(t)$ and $y(t) = \ln P(t)$. The first variable $x(t)$
contains all the information about the cash flows. The second “barrier” variable
$y(t)$ contains the information about the barrier, defined for simplicity as a
constant $y^{(bdy)} = \ln H$. We write the (constant) volatilities as $\sigma_x, \sigma_y$ and the
drifts as $\mu_x = r_x - q_x - \sigma_x^2/2$, $\mu_y = r_y - q_y - \sigma_y^2/2$. The initial point and
time are $x_0, y_0, t_0$ and the final (exercise) point and time are $x^*, y^*, t^*$. We write
$\tau_{0*} = t^* - t_0$ and call $(Dx)_{0*} = x_0 - x^* + \mu_x \tau_{0*}$ and $(Dy)_{0*} = y_0 - y^* + \mu_y \tau_{0*}$. It
is also useful to define a quantity $(Dy_\perp)_{0*}$ that is the component of $(Dy)_{0*}$
geometrically perpendicular to $(Dx)_{0*}$,

$$(Dy_\perp)_{0*} = (Dy)_{0*} - \rho\, \frac{\sigma_y}{\sigma_x}\, (Dx)_{0*} \qquad (19.1)$$

The 2D Green function in the absence of barriers that we call $G_{0*}^{(2D)}$ is given by


' History: I did most of the hybrid option calculations in the text in 1993. The correlated 
default calculations were done at Bloomberg LP. 


? Acknowledgement: I thank James Turetsky for helpful conversations. 


3 Notation: The parameters $r_x$ and $q_x$ are the risk-free rate and dividend yield for $S$, while
$r_y$ and $q_y$ are the risk-free rate and dividend yield for $P$. The forms of the drifts are
standard and follow from academic no-arbitrage considerations.
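$$G_{0*}^{(2D)} = N_{0*} \exp\left[-\Phi_{0*}^{(2D)}\right] \qquad (19.2)$$

Here, $N_{0*} = \left[2\pi \sigma_x \sigma_y \tau_{0*} (1 - \rho^2)^{1/2}\right]^{-1}$ and $\Phi_{0*}^{(2D)}$ is the usual

$$\Phi_{0*}^{(2D)} = \frac{1}{2(1-\rho^2)\,\tau_{0*}} \left[ \frac{(Dx)_{0*}^2}{\sigma_x^2} + \frac{(Dy)_{0*}^2}{\sigma_y^2} - \frac{2\rho\,(Dx)_{0*}(Dy)_{0*}}{\sigma_x \sigma_y} \right] \qquad (19.3)$$

$\Phi_{0*}^{(2D)}$ is nicely separated using the above variables as

$$\Phi_{0*}^{(2D)} = \frac{(Dx)_{0*}^2}{2\sigma_x^2 \tau_{0*}} + \frac{(Dy_\perp)_{0*}^2}{2(1-\rho^2)\,\sigma_y^2 \tau_{0*}} \qquad (19.4)$$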
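The separation of the exponent can be checked numerically. The sketch below (hypothetical function names, illustrative parameter values) evaluates the barrier-free 2D Green function both in the separated form, using the perpendicular component of $Dy$, and in the usual bivariate form, and integrates it over the final point to confirm the normalization.

```python
import math

def green_2d_separated(x0, y0, xs, ys, mu_x, mu_y, sig_x, sig_y, rho, tau):
    """Barrier-free 2D Green function with the exponent in separated form."""
    Dx = x0 - xs + mu_x * tau
    Dy = y0 - ys + mu_y * tau
    Dy_perp = Dy - rho * (sig_y / sig_x) * Dx       # part of Dy "perpendicular" to Dx
    phi = (Dx * Dx / (2.0 * sig_x**2 * tau)
           + Dy_perp * Dy_perp / (2.0 * (1.0 - rho**2) * sig_y**2 * tau))
    norm = 1.0 / (2.0 * math.pi * sig_x * sig_y * tau * math.sqrt(1.0 - rho**2))
    return norm * math.exp(-phi)

def green_2d_standard(x0, y0, xs, ys, mu_x, mu_y, sig_x, sig_y, rho, tau):
    """Same density written in the usual bivariate-normal form."""
    u = (x0 - xs + mu_x * tau) / (sig_x * math.sqrt(tau))
    v = (y0 - ys + mu_y * tau) / (sig_y * math.sqrt(tau))
    phi = (u * u - 2.0 * rho * u * v + v * v) / (2.0 * (1.0 - rho**2))
    norm = 1.0 / (2.0 * math.pi * sig_x * sig_y * tau * math.sqrt(1.0 - rho**2))
    return norm * math.exp(-phi)

def total_probability(x0, y0, mu_x, mu_y, sig_x, sig_y, rho, tau, n=160, width=8.0):
    """Midpoint-rule integral of the Green function over the final point (x*, y*)."""
    cx, cy = x0 + mu_x * tau, y0 + mu_y * tau       # centre of the final distribution
    wx, wy = width * sig_x * math.sqrt(tau), width * sig_y * math.sqrt(tau)
    hx, hy = 2.0 * wx / n, 2.0 * wy / n
    total = 0.0
    for i in range(n):
        xs = cx - wx + (i + 0.5) * hx
        for j in range(n):
            ys = cy - wy + (j + 0.5) * hy
            total += green_2d_separated(x0, y0, xs, ys, mu_x, mu_y,
                                        sig_x, sig_y, rho, tau)
    return total * hx * hy
```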
We need the two-dimensional image Green function $G_{0*}^{(2D\_image)}$ generalizing
the one-dimensional case. The idea is the same. We need to solve the diffusion
equation appropriate to the assumptions made in setting up the problem,
consistent with the boundary conditions implied by the existence of the barrier.
The 2D image Green function $G_{0*}^{(2D\_image)}$ can be obtained by noting that, as far
as the variable $y$ is concerned, the $x$ variable enters $\Phi_{0*}^{(2D)}$ through $(Dy_\perp)_{0*}$. For
fixed $x$ we can think of the term $-\rho\,\frac{\sigma_y}{\sigma_x}\,(Dx)_{0*}$ as providing a modified drift in
$(Dy)_{0*}$. We can then use the one-dimensional formalism for images in the
variable $y$. Define the image path for $y(t)$ as
$y^{(image)}(t) = 2y^{(bdy)} - y(t) = 2\ln H - y(t)$. We replace $y_0$ in $\Phi_{0*}^{(2D)}$ by the
image $y_0^{(image)} = 2y^{(bdy)} - y_0$ to get $\Phi_{0*}^{(2D\_image)}$. Explicitly, we write

$$\Phi_{0*}^{(2D\_image)} = \frac{(Dx)_{0*}^2}{2\sigma_x^2 \tau_{0*}} + \frac{\left[(Dy_\perp^{(image)})_{0*}\right]^2}{2(1-\rho^2)\,\sigma_y^2 \tau_{0*}} \qquad (19.5)$$

Here,

$$(Dy_\perp^{(image)})_{0*} = y_0^{(image)} - y^* + \mu_y \tau_{0*} - \rho\, \frac{\sigma_y}{\sigma_x}\, (Dx)_{0*} \qquad (19.6)$$


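We now write the 2D image Green function in the same form as for one
dimension,

$$G_{0*}^{(2D\_image)} = N_{0*}\, K^{(2D)} \exp\left[-\Phi_{0*}^{(2D\_image)}\right] \qquad (19.7)$$

The extra factor $K^{(2D)}$ is tricky. The one-dimensional analogy does not give
the whole story. An $x$-independent term is required. This is obtained directly
from the requirement that $G_{0*}^{(2D\_image)}$ solve the diffusion equation. The diffusion
equation for $\tau_{0*} > 0$ is $\mathcal{O}_0\, G = 0$ where the 2D backward diffusion operator is

$$\mathcal{O}_0 = \partial_{t_0} + \mu_x \partial_{x_0} + \mu_y \partial_{y_0} + \tfrac{1}{2}\sigma_x^2\, \partial_{x_0}^2 + \tfrac{1}{2}\sigma_y^2\, \partial_{y_0}^2 + \rho\, \sigma_x \sigma_y\, \partial_{x_0} \partial_{y_0} \qquad (19.8)$$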


We find 


(19.9) 


The first term in the square bracket proportional to $\rho$ is the extra term. With
the 2D image Green function now determined, we write the total 2D Green
function $G_{0*}^{(2D\_KO)}$ for knock-out in the presence of the barrier as

$$G_{0*}^{(2D\_KO)} = G_{0*}^{(2D)} - G_{0*}^{(2D\_image)} \qquad (19.10)$$

The reader is no doubt anxiously awaiting the news that the 2D knockout
Green function satisfies the semigroup property. It does.
The 2D knock-in Green function $G_{0*}^{(2D\_KI)}$ is given by the same sort of
expressions as in one dimension, involving both $G_{0*}^{(2D)}$ and $G_{0*}^{(2D\_image)}$,


^ Backward Equation: The backward diffusion equation is with respect to the initial 
variables indicated by the subscript 0. 
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depending on the values of x . The sum of the integrated probabilities of the KO 
and KI Green functions has to be one, e.g. 


Prob (Up + Out) + Prob(Up + In) =1 (19.11) 


Pricing the Barrier 2-Dimension Hybrid Options 


We are now prepared to price barrier 2D hybrid options. Again, we are assuming
constant continuous barrier and European-style options. Call the payoff cash flow
at the exercise date $C^*(x^*, t^*)$. For example for a call option,

$$\$C^*(x^*, t^*) = \$P \cdot c_x \cdot \left[\exp(x^*) - E\right]_+ \qquad (19.12)$$

Here \$P is a principal or notional amount and $c_x$ is an extra factor depending
on the details. Examples of $c_x$ are one for a single stock option and
$N_{days}/360 \approx 1/4$ for a 3 month Libor caplet. Call the discount factor (zero
coupon bond) $P^{(t^*)}$. The expected discounted value of $C^*(x^*, t^*)$ using the
full 2D Green function $G_{0*}^{(2D)}$ is the 2D option value $\$C(x_0, t_0)$ at today’s time
$t_0$. We get, with appropriate limits on the integrals,

$$\$C(x_0, t_0) = P^{(t^*)} \int_{x_{min}} dx^* \int_{y_{min}} dy^*\; G_{0*}^{(2D)}\; \$C^*(x^*, t^*) \qquad (19.13)$$

We may also get a combination of such integrals, depending on the
circumstances. We avoid the temptation to list all the cases, but instead give
some useful integrals, with which the interested reader can derive all cases.


Correlated Default Calculations 


See Ch. 31 for a model of correlated default calculations using the results of this 
chapter. 
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Useful Integrals for 2D Barrier Options 
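$$\int_{x_{min}}^{x_{max}} dx \int_{y_{min}}^{y_{max}} dy\, \exp\left\{-\left[A_x x^2 + A_y y^2 - B x y + C_x x + C_y y\right]\right\}$$
$$= I_2(x_{max}, y_{max}) + I_2(x_{min}, y_{min}) - I_2(x_{max}, y_{min}) - I_2(x_{min}, y_{max}) \qquad (19.14)$$

$$I_2(\hat x, \hat y) = \int_{-\infty}^{\hat x} dx \int_{-\infty}^{\hat y} dy\, \exp\left\{-\left[A_x x^2 + A_y y^2 - B x y + C_x x + C_y y\right]\right\}$$
$$= \frac{\pi\, e^{\Lambda}}{\sqrt{A_x A_y (1-\rho^2)}}\; N_2\left[ (\hat x + X)\sqrt{2A_x(1-\rho^2)},\; (\hat y + Y)\sqrt{2A_y(1-\rho^2)};\; \rho \right] \qquad (19.15)$$

In Eq. (19.15), the correlation is $\rho = \dfrac{B}{2\sqrt{A_x A_y}}$ and

$$X = \frac{C_y B + 2C_x A_y}{4A_x A_y - B^2}, \quad Y = \frac{C_x B + 2C_y A_x}{4A_x A_y - B^2}, \quad \Lambda = \frac{C_x^2 A_y + C_y^2 A_x + C_x C_y B}{4A_x A_y - B^2} \qquad (19.16)$$

Also, $N_2(a, b; \rho)$ is the usual bivariate integral,

$$N_2(a, b; \rho) = \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{a} dx \int_{-\infty}^{b} dy\, \exp\left[-\frac{x^2 - 2\rho x y + y^2}{2(1-\rho^2)}\right] \qquad (19.17)$$

A mix of bivariate numerical methods can be useful, depending on parameters;
see the references.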
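As a hedged numerical check of Eqs. (19.15)-(19.17), the sketch below evaluates $N_2$ by reducing it to a one-dimensional Simpson integral, builds $I_2$ from the closed form, and compares against brute-force two-dimensional quadrature. The reduction of $N_2$ to one dimension and all function names are choices made here for illustration; they are not the bivariate methods of the references.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bivariate_cdf(a, b, rho, n=2000, lower=-10.0):
    """N2(a, b; rho) via Simpson's rule on the conditional-normal reduction."""
    if a <= lower:
        return 0.0
    h = (a - lower) / n                      # n must be even for Simpson's rule
    s = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for i in range(n + 1):
        x = lower + i * h
        w = 1.0 if i in (0, n) else (4.0 if i % 2 == 1 else 2.0)
        total += w * norm_pdf(x) * norm_cdf((b - rho * x) / s)
    return total * h / 3.0

def i2_closed_form(xh, yh, Ax, Ay, B, Cx, Cy):
    """Closed form of the quadratic-exponent integral in terms of N2."""
    D = 4.0 * Ax * Ay - B * B
    rho = B / (2.0 * math.sqrt(Ax * Ay))
    X = (Cy * B + 2.0 * Cx * Ay) / D
    Y = (Cx * B + 2.0 * Cy * Ax) / D
    lam = (Cx * Cx * Ay + Cy * Cy * Ax + Cx * Cy * B) / D
    pref = math.pi * math.exp(lam) / math.sqrt(Ax * Ay * (1.0 - rho * rho))
    return pref * bivariate_cdf((xh + X) * math.sqrt(2.0 * Ax * (1.0 - rho * rho)),
                                (yh + Y) * math.sqrt(2.0 * Ay * (1.0 - rho * rho)), rho)

def i2_brute_force(xh, yh, Ax, Ay, B, Cx, Cy, lower=-9.0, n=300):
    """Direct midpoint-rule evaluation, with -infinity truncated at `lower`."""
    hx, hy = (xh - lower) / n, (yh - lower) / n
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * hx
        for j in range(n):
            y = lower + (j + 0.5) * hy
            total += math.exp(-(Ax * x * x + Ay * y * y - B * x * y + Cx * x + Cy * y))
    return total * hx * hy
```

The finite-limit integral of Eq. (19.14) then follows by combining four calls to `i2_closed_form` with the signs shown there.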
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20. Average-Rate Options (Tech. Index 8/10) 


Options often contain arithmetic averaging features. In this chapter, we use
path-integral techniques to obtain some general results. The reader is referred to the
chapters on path integrals (Ch. 41-45) for more details.

To get an idea of the problem, consider the two figures below. The first figure
has the valuation date $t_0$ during or inside the averaging period.

[Figure: Averaging with $t_0$ During Averaging Period. The total averaging period $T_{s*}$ is split at $t_0$ into the part already reset, $T_{s0}$, and the part not yet reset, $T_{0*}$.]


' History: These calculations were reported in my SIAM talk in 1993 and performed in 
1989-93. Similar results can be obtained for equities and FX. The difference is that for 
interest-rate options the discount factor gets involved in the averaging calculations, which 
are thus more difficult. 


? Notation: Time intervals are denoted by Tap = tb — ta. 
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The second figure has valuation date ¢, before the averaging period starts. 


[Figure: Averaging with $t_0$ Before Averaging Period. The interval $T_{0s}$ before averaging starts, in which no rates are reset, is followed by the total averaging period $T_{s*}$.]

Arithmetic Average Rate Options in General Gaussian Models 


Arithmetic-average rate options can be done exactly in Gaussian rate models. We
exhibit the first part of the evaluation of average rate options with any arbitrary
Gaussian model. Then we will get an explicit form using the Mean-Reverting
Gaussian (MRG) model.

Again, we want to evaluate the option today at time $t_0$. Either $t_0$ is before
the averaging period $(t_s, t^*)$ starts at $t_s$ or else $t_0$ is inside the averaging period.
If $t_0$ is after the averaging period is over, there is nothing left to average.

Consider a time-partitioned discretization of the interest rate we want to
average. If we are inside the averaging period, some rates have already been
determined or “reset”; call them $\{s_l\}$, $l = 1 \ldots L$. Call $\{r_j\}$, $j = 0 \ldots N-L$ the
undetermined rates. Now define the general linear average as


? Reset Rates, Systems, and the Back Office: The back office is where the 
administration of the deals takes place. A well-oiled back office is critical to the 
$$F(r) = \frac{1}{N+1} \left[ \sum_{l=1}^{L} \xi_l\, s_l + \sum_{j=0}^{N-L} \xi_j\, r_j \right] \qquad (20.1)$$

We have included the possibility of some arbitrary linear coefficients $\{\xi\}$.
The most common situation is just simple averaging, $\xi = 1$.


Next, we introduce a dummy variable $\hat F$ through a Dirac delta function,
using the identity

$$1 = \int_{-\infty}^{\infty} \delta\left[\hat F - F(r)\right] d\hat F \qquad (20.2)$$

The Fourier decomposition of the $\delta$-function introduces another variable $p$,

$$\delta\left[\hat F - F(r)\right] = \int_{-\infty}^{\infty} \frac{dp}{2\pi}\, \exp\left\{-ip\left[\hat F - F(r)\right]\right\} \qquad (20.3)$$

The oscillations kill the integral if $\hat F \neq F(r)$ and give $\infty$ if $\hat F = F(r)$.
Now the value of the payoff $C^*$ for the option at time $t^*$ depends on the path
that the underlying rate takes from $t_0$ to $t^*$ and on the averaging function. We
call the option payoff $C^*[F(r)]$. Now $C^*[\hat F]$ can be used as a “test
function” for the $\delta$-function, meaning simply that we can write the identity

$$C^*[F(r)] = \int_{-\infty}^{\infty} C^*[\hat F]\; \delta\left[\hat F - F(r)\right] d\hat F \qquad (20.4)$$


functioning of a desk. One of the tasks is to maintain a system containing a database of 
the reset values that need to be input to the model in the system. By the way, this already 
shows one way that a real system is much more than just a model calculator containing 
the algorithm being described. System risk is discussed in Ch. 34-35. 
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We call C the value of the option at the valuation date (today), t). We use 
variables $\{x_j\}$ where $x_j$ is the deviation of $r_j$ from the “classical path”. We
need the expectation value using the path integral for the Green function
$G(x_0, x^*; t_0, t^*)$, which in this chapter includes a discount factor. The path
integral depends on all values of the paths from $t_0$ to $t^*$. We have

$$C = \int G(x_0, x^*; t_0, t^*) \cdot C^*[F(r)]\, dx^* \qquad (20.5)$$

Using an expectation notation and using the above identity, we get

$$C = \int_{-\infty}^{\infty} d\hat F\; C^*(\hat F) \int_{-\infty}^{\infty} \frac{dp}{2\pi}\, e^{-ip\hat F} \int dx^*\, G_{0*}(p) \qquad (20.6)$$

Here,

$$G_{0*}(p) = \left\langle \exp\left[-\int_{t_0}^{t^*} r(t)\,dt\right] \exp\left[ipF(r)\right] \right\rangle \qquad (20.7)$$

The expectation notation is just shorthand for the path integration and the
discount factor is really discretized. Now we need the general integral that
appears in the path integration for Gaussian models,


P m © ip m m m 
L(p)s f Í АСЕ Lae =) Dr Е >, ТА! 
Jal -®© j=l j=l Jj 
(20.8) 
This is evaluated by completing the square as 
1 (p) == reno ip t | (20.9) 
"V^ (det Ay” 2 | 


Here, 


* Mean-reverting Gaussian model Classical Path and Green function: Classical path: 
see eqs. 43.26 and 43.41; the Green function is eq. 43.32. 
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ЕЕ 
1 
b =— (47 " А 
swap 2,5 5t EM 
ЖЛЕ 


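Finally, we need

$$\int_{-\infty}^{\infty} \frac{dp}{2\pi}\, \exp\left[-ip\hat F + iap - \tfrac{1}{2} b^2 p^2\right] = \frac{1}{\sqrt{2\pi b^2}}\, \exp\left[-\frac{(\hat F - a)^2}{2b^2}\right] \qquad (20.11)$$

With this, we obtain the exact result

$$C = P^{(t^*)} \int_{-\infty}^{\infty} d\hat F\; C^*(\hat F)\, \frac{1}{\sqrt{2\pi b^2}}\, \exp\left[-\frac{(\hat F - a)^2}{2b^2}\right] \qquad (20.12)$$

Here, $P^{(t^*)}$ is today’s zero-coupon bond of maturity $t^*$ producing the discounting.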


Examples: Average-Rate Caplet, Digital/Step Option 
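For a few special cases, consider an average-rate caplet payoff,

$$C^*[F(r)] = [F(r) - E]\; \theta[F(r) - E] \qquad (20.13)$$

We get the average-rate caplet value in this model as

$$C_{caplet} = P^{(t^*)} \left[ (a - E)\, N\!\left(\frac{a-E}{b}\right) + \frac{b}{\sqrt{2\pi}}\, \exp\left(-\frac{(a-E)^2}{2b^2}\right) \right] \qquad (20.14)$$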

Another example is the average all-or-nothing step or digital option. The 
payoff is 
$$C^*[F(r)] = \theta[F(r) - E] \qquad (20.15)$$

The average-rate all-or-nothing step or digital option is

$$C_{digital} = P^{(t^*)}\, N\!\left(\frac{a-E}{b}\right) \qquad (20.16)$$
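The caplet and digital results follow from integrating the payoff against the Gaussian density in $\hat F$ of Eq. (20.12). The sketch below checks this by Simpson quadrature. The function names, the inputs $a$, $b$, $E$ and the discount factor are illustrative assumptions; in the text $a$ and $b$ come from the rate model.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def average_option_by_quadrature(payoff, a, b, discount, n=4000, width=10.0):
    """Discounted integral of payoff(F) against a Gaussian with mean a, std dev b."""
    lo, hi = a - width * b, a + width * b
    h = (hi - lo) / n                         # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        F = lo + i * h
        w = 1.0 if i in (0, n) else (4.0 if i % 2 == 1 else 2.0)
        density = math.exp(-0.5 * ((F - a) / b) ** 2) / (b * math.sqrt(2.0 * math.pi))
        total += w * payoff(F) * density
    return discount * total * h / 3.0

def average_caplet(a, b, E, discount):
    """Closed-form average-rate caplet in a Gaussian model."""
    y = (a - E) / b
    return discount * ((a - E) * norm_cdf(y)
                       + b * math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi))

def average_digital(a, b, E, discount):
    """Closed-form average-rate all-or-nothing option paying one unit."""
    return discount * norm_cdf((a - E) / b)
```

For example, `average_option_by_quadrature(lambda F: max(F - E, 0.0), a, b, discount)` should agree closely with `average_caplet(a, b, E, discount)`. The digital converges more slowly under quadrature because of the jump in its payoff.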


Results for Average-Rate Options in the MRG Model 


In order to obtain explicit results for the quantities $a$ and $b^2$, we use the
mean-reverting Gaussian model with constant volatility $\sigma$ and mean reversion $\omega$. We
also take uniform averaging, $\xi_j = 1$, for simplicity. We obtain two separate
results depending on whether $t_0$ is before the averaging starts or is in the
averaging period. The calculations are messy but straightforward.
The main trick is to combine all terms in the discretized interest rate $r_j$:

$$\exp\left[-r_j\, dt + ip\, r_j/(N+1)\right] = \exp\left[-r_j\, h(p)\, dt\right] \qquad (20.17)$$

Here, $h(p) = 1 - ip/T_{s*}$ where $T_{s*} = t^* - t_s$ is the complete averaging period.
Now we change variables, scaling rates by $h(p)$, and we define $p$-dependent
“volatilities” by $\sigma^2(p) = h^2(p)\, \sigma^2$. The integrations can then be carried out.


Valuation date Before Averaging Period Starts 


If $t_0$ is before the averaging period starts, we get


b eda? |Т (т, 07.) + ior.) 


л, ilg 
(Naja ^ QT 


(20.18) 


Here, $f_s$ is the forward rate at time $t_s$ (as determined today), $T_{0s} = t_s - t_0$ is the
time interval up to the start of the averaging and again $T_{s*} = t^* - t_s$ is the
complete averaging period. The generalized variance $b^2$ gets a full contribution
from the period $T_{0s}$. It is suppressed by a factor 1/3 in the averaging period $T_{s*}$
up to some extra factors. The extra factors are


1 жү OL ye 
RPM 2 (ws, (OTa) $ = \ i | (20.19) 
gon) or. +2(¢ we 1) | reu 1) 


Valuation date During the Averaging Period 


If $t_0$ is inside the averaging period we get the results


2 3 


Th. 
p= ZT ж (oy) 


a* 


1 E N-L І ор 
аат) 252. 29 dis 


(20.20) 


Now the variance is suppressed by 1/3 times an extra factor. 


Simple Harmonic Oscillator Derivation for Average Options 


The problem of average options can be done two other ways. The second method
uses the quantum-mechanical simple harmonic oscillator (SHO) analogy. We
need to replace the SHO in quantum mechanics using $\omega \to i\omega$, where $i = \sqrt{-1}$.
This turns the Schrödinger equation into the diffusion equation. It also gets rid of
the oscillations that occur in quantum mechanics, but not in finance.

As described in Ch. 43, App. A, we use the MRG stochastic equation at each 
time /, , namely in Е Í m а) which is the Gaussian measure we set 


—00 


1, = Ee A ( 7 odt) (са) 


5 Estimates of volatility when part of the period involves averaging: These formulas 
can be used for quick estimates of the total volatility in deals. 
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(cl) 


Here, $x_j = r_j - r_j^{(cl)}$ measures the random difference of the short term rate
from the classical path $r_j^{(cl)}$ at time $t_j$. The classical path is given by the forward
rate $f_j$ up to a convexity correction.


In carrying out the calculation, we need to make sure we include the surface 
terms, namely the cross terms. These are generally unimportant, but here the 
surface terms cannot be ignored. After some algebra, we get the same result as 
obtained above using explicit multivariate integration. 


Thermodynamic Identity Derivation for Average Options 


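The third method is to use thermodynamic equalities, valid for Gaussian models.
We have

$$\langle \exp(X) \rangle = \exp\left[ \langle X \rangle + \tfrac{1}{2} \langle X^2 \rangle_c \right] \qquad (20.21)$$

Here, the “connected part” is $\langle X^2 \rangle_c = \langle X^2 \rangle - \langle X \rangle^2$. We need to choose

$$X = -\int_{t_0}^{t^*} r(t)\,dt + ipF(r) \qquad (20.22)$$

Contact with the first derivation is established by recalling that

$$\langle \exp(X) \rangle = \int dx^*\, G_{0*}(p) \qquad (20.23)$$

We can evaluate $\langle X^k \rangle$ $(k = 1, 2)$ explicitly and obtain the same answer.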
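The identity can be checked numerically for a complex Gaussian exponent. In the sketch below the integrated rate and the average are represented by two correlated Gaussian variables, here called `A` and `F`; these stand-ins, their parameters and the function names are assumptions made only for the illustration. The expectation is computed by brute-force quadrature over two independent standard normals and compared with the cumulant formula.

```python
import cmath
import math

def expectation_exp_x(m_A, s_A, m_F, s_F, rho, p, n=200, width=8.0):
    """<exp(-A + i p F)> for jointly Gaussian (A, F), by midpoint quadrature."""
    h = 2.0 * width / n
    c = math.sqrt(1.0 - rho * rho)
    total = 0j
    for i in range(n):
        z1 = -width + (i + 0.5) * h
        w1 = math.exp(-0.5 * z1 * z1)
        A = m_A + s_A * z1
        for j in range(n):
            z2 = -width + (j + 0.5) * h
            F = m_F + s_F * (rho * z1 + c * z2)
            total += w1 * math.exp(-0.5 * z2 * z2) * cmath.exp(-A + 1j * p * F)
    return total * h * h / (2.0 * math.pi)

def cumulant_formula(m_A, s_A, m_F, s_F, rho, p):
    """exp(<X> + <X^2>_c / 2) with X = -A + i p F."""
    mean_X = -m_A + 1j * p * m_F
    connected = s_A**2 - 2j * p * rho * s_A * s_F - (p * s_F) ** 2
    return cmath.exp(mean_X + 0.5 * connected)

def discount_a_b(m_A, s_A, m_F, s_F, rho):
    """Read off the discount factor, a and b from the p-dependence of the exponent."""
    discount = math.exp(-m_A + 0.5 * s_A**2)
    a = m_F - rho * s_A * s_F      # mean of F shifted by its covariance with A
    b = s_F
    return discount, a, b
```

Grouping the exponent by powers of $p$ gives the discount factor times $\exp(iap - \tfrac{1}{2} b^2 p^2)$, which is the form used in Eq. (20.11).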


Average Options with Log-Normal Rate Dynamics 


Most applications assume lognormal behavior. However, the sum of lognormal
variables is not lognormal. For this reason, analytic techniques cannot be used
for exact results for arithmetic average options in a lognormal world.

There is a closed form for geometric-average options using lognormal
dynamics, derived by Turnbull and Wakeman.
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Turnbull and Wakeman showed that a moment expansion could be used for 
reasonable numerical approximations to an arithmetic-average option with
underlying lognormal dynamics; see also Levy.

This moment method was criticized by Milevsky and Posner as being
inaccurate for long dated trades or cases with high volatility. These authors
suggest using a reciprocal gamma probability distribution, in which case exact
results are obtained for arithmetic-average options.

Berger et al. document a number of Asian options and the techniques used to
solve them.


Gaussian into Lognormal Using a Simple Trick 


Here we show one simple way to get approximate results for lognormal dynamics 
from Gaussian model results. Consider the results for the average caplet and 
average digital step option above. It is easy to see that 


С 


caplet 


@ 
2 
= [ -E+ b Z eus (20.24) 
The forms for the parameters a,b are given above. Now set y — (a -E ) /b, 


бу = b/(a + E). Assume a « E and b «|a + E |, so the options are near at-the- 


money and the volatility is small. Then we have the approximations 


[acie N) амаз) Ену) (20.25) 
24 


(а-Е) 1 un" (20.26). 


b (Б/Е) XE 
With these approximations, we get the result 


С 


caplet 


= Pl) [aN (d,)—EN(d_)] (20.27) 


Here, $d_\pm = (E/b)\left[ \ln(a/E) \pm b^2/(2E^2) \right]$. Using $\tau_{*0} = t^* - t_0$, we define
the average lognormal volatility $\sigma_{LN}$ as


$\sigma_{LN}^2 \, \tau_{*0} = b^2 / E^2$ (20.28)

This redefinition of the volatility then puts the above expression for $C_{caplet}$ into
the canonical lognormal form.
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21. Fat Tail Volatility (Tech. Index 5/10) 


In this chapter, we look at fat tails in distributions of underlying variable moves 
from a practical perspective. We are especially concerned with obtaining some 
sort of volatility for fat tails. We introduce the idea using some examples, and 
deal with practical questions at the end. 


Gaussian Behavior and Deviations from Gaussian 


It has been known for many years that the probability distribution of underlying
changes of financial variables $d_t x(t) = x(t + dt) - x(t)$ over time interval $dt$ is
at best only approximately Gaussian. Practically all production models in finance
use Gaussian assumptions, either associating $x(t)$ directly with a financial
variable or using some simple transformation like $x(t) = \ln r(t)$, which leads to
the lognormal model¹. The description of deviations from Gaussian behavior
forms a large part of this book. In this chapter, we will focus on jump outliers,
giving rise to “fat tails”.


Review of Some Math Formalism 


Gaussian behavior means we assume $d_t x(t) = [\mu(t) + \sigma(x,t)\,\eta(t)]\,dt$. The
stochastic variable $\eta(t)$ satisfies $\langle \eta^2(t) \rangle = 1/dt$, and has probability distribution
$P[\eta(t)] = \sqrt{dt/2\pi}\, \exp[-\eta^2(t)\,dt/2]$. In addition, for different times we
have the delta-function normalization $\langle \eta(t)\,\eta(t') \rangle = \delta(t - t')$. An equivalent
notation is $\eta(t) = dz(t)/dt$. The drift function $\mu(t)$ is assumed deterministic
and fixed by no-arbitrage arguments. The substitution of the stochastic equation


' Is there a Transformation of the Data to Gaussian? A possible exercise would be to 
try to find a transformation of variables so that the data exhibit a Gaussian probability 
distribution. Because there are so many ways that the Gaussian behavior is broken, it is 
possible that such an exercise would be highly unstable and idiosyncratically dependent 
on the individual variables and the time intervals. The existence of macroscopic large 
time scale parameters and the presence of jumps or gapping behavior at short time scales 
are essential complications. 
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in the probability distribution for n(t) leads to the Gaussian probability 


distribution for $d_t x(t)$, namely²


$P[d_t x(t)] = \frac{1}{\sqrt{2\pi \sigma^2(t)\,dt}} \exp\left\{ -\frac{[d_t x(t) - \mu(t)\,dt]^2}{2\sigma^2(t)\,dt} \right\}$ (21.1)


Outliers and Fat Tails 


Example of a Fat-Tail Event in the Real World 


The reader might need to be convinced that there is a fat-tail problem. To this 
end, here is the example used in my CIFEr tutorials. On June 2, 1995 at 8:30 am, 
the 10-year US treasury yield dropped 35 bp in half an hour. Assume Gaussian 


behavior of the logarithm of yield changes, scale this move by $\sqrt{\text{time}}$ for one
year³, and write the result as $\Delta = 35\,\text{bp} \cdot \sqrt{1\,\text{yr} / 30\,\text{min}} = k_\sigma\, \sigma \sqrt{1\,\text{yr}}$.
This produces k, « 20 standard deviations, occurring with probability 10. 


To drive home what this means, 10 ^? happens to be on the order of the 
probability of specifying at random a tiny volume containing one proton in the 
whole observable universe. Since all this is absurd, the conclusion is that these fat 
tails manifestly violate the Gaussian assumption. 
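
The arithmetic behind this example can be checked with a short script. This is a minimal sketch, assuming the parameters quoted in the footnote on fat-tail event parameters (15% lognormal yield vol, 252 business days/yr, 6.5 trading hrs/day, 6.2%/yr rate level); the function name and the small-move approximation for the log change are illustrative choices, not taken from the text.

```python
import math

def sigmas_for_move(move_bp, rate_pct, lognormal_vol, minutes,
                    days_per_year=252, hours_per_day=6.5):
    """Number of Gaussian standard deviations for a yield move over `minutes`."""
    # Fraction of a trading year covered by the interval
    year_fraction = (minutes / 60.0) / (days_per_year * hours_per_day)
    # Approximate change in ln(yield): move divided by the rate level
    log_move = (move_bp / 1e4) / (rate_pct / 100.0)
    # Lognormal vol scaled down to the interval by square-root-of-time
    sigma_interval = lognormal_vol * math.sqrt(year_fraction)
    return log_move / sigma_interval

k = sigmas_for_move(move_bp=35, rate_pct=6.2, lognormal_vol=0.15, minutes=30)
tail_prob = 0.5 * math.erfc(k / math.sqrt(2.0))  # one-sided Gaussian tail probability
print(f"k = {k:.1f} standard deviations, Gaussian tail probability = {tail_prob:.1e}")
```

The script prints the number of standard deviations and the corresponding one-sided Gaussian tail probability, so the reader can vary the assumed vol or trading hours and see how little the conclusion changes.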

Notable fat-tail events include the 1987 crash that was the winner for percent 
changes in equity-land. For FX, huge fat-tail moves occur when a pegged 
currency (with an artificially small volatility with respect to USD) is unpegged. 
Other examples exist in emerging markets, commodities, etc. 


2 Path Integrals: There is an additional step involving integrations over the $x(t+dt)$
variable and the $\eta(t)$ variable that we have not shown. For the details, see Ch. 42, 43 on
Path Integrals. All you have to do to get the general path integral is to repeat the above 
procedure at each time t, take the product, and then integrate over the physical region of 
the underlying variable at each time. 


> Fat-Tail Event Parameters: Recall, the mathematical definition is that Gaussian 
behavior of changes, which is equivalent to Brownian motion, is supposed to hold at all
time scales. We used the lognormal yield volatility σ = 15%, 252 business days/yr, 6.5
trading hrs/day, along with a rate of 6.2%/yr, producing 21.5 standard deviations.


4 Story: Many Fat-Tail Examples: I used to collect printouts of great fat-tail events
pinned up on a bulletin board. Eventually, there were so many pieces of paper that the pin 
couldn't hold them. They all fell down on the floor. 
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Gaussian Random Numbers for MC Simulations in Excel 


We will need to know how to run simple Gaussian Monte Carlo simulations, and 
we will learn a little about Excel in the process. Gaussian random numbers are 


obtained via $R_1 = (-2 \ln U_1)^{1/2} \cos(2\pi U_2)$, $R_2 = (-2 \ln U_1)^{1/2} \sin(2\pi U_2)$ with
$U_{1,2} = U(0,1)$ uniform random numbers⁵.
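
For readers prototyping outside Excel, the transformation above (the Box-Muller method) can be sketched in a few lines. The seed, sample size, and function name below are assumptions made for illustration only.

```python
import math
import random

def box_muller(rng):
    """Return two independent standard Gaussian numbers from two uniforms."""
    u1 = 1.0 - rng.random()   # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    return (radius * math.cos(2.0 * math.pi * u2),
            radius * math.sin(2.0 * math.pi * u2))

rng = random.Random(12345)    # fixed seed, chosen only for reproducibility
samples = [g for _ in range(50_000) for g in box_muller(rng)]

# Check the mean and standard deviation, as the Excel footnote recommends
mean = sum(samples) / len(samples)
std = math.sqrt(sum((g - mean) ** 2 for g in samples) / len(samples))
print(f"mean = {mean:.3f}, std = {std:.3f}")
```

Unlike the spreadsheet, the seed here is under explicit control, so the same set of random numbers can be regenerated at will.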


Simple Illustrative Fat-Tail Monte-Carlo Model in Excel 
Here is a simple pedagogical example illustrating fat tails using a volatility that
depends on the changes, $\sigma = \sigma[d_t x]$. Gaussian random numbers $\{G_\lambda\}$,
$\lambda = 1, \ldots, N_{MC}$, are sorted from smallest to largest⁷, so $G_\lambda < G_{\lambda+1}$. This orders the
$\{d_t x\}$ numerically. A volatility $\sigma_\lambda$ is assigned to $G_\lambda$ as deterministically
increasing, viz $\sigma_\lambda < \sigma_{\lambda+1}$. The graph of the volatilities will be shown below. The
illustrative model is $d_t x(G_\lambda; \sigma_\lambda) = \sigma_\lambda G_\lambda$. The histogram below⁸ gives
the results, along with a standard model using a constant volatility $\sigma_0$ and the
same random numbers, viz $d_t x^{Std}(G_\lambda; \sigma_0) = \sigma_0 G_\lambda$.
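
The pedagogical model can be reproduced outside a spreadsheet as follows. This is a sketch under stated assumptions: the footnoted parameters ($N_{MC} = 100$, $\sigma_0 = 1.5$, vols running from 1.0 to 3.0) are used, but the linear shape of the vol ramp and the seed are guesses, since the text only says the prescription is deterministic and increasing.

```python
import random

N_MC = 100
SIGMA_0 = 1.5                     # constant vol of the standard model

rng = random.Random(7)            # arbitrary seed for reproducibility
g = sorted(rng.gauss(0.0, 1.0) for _ in range(N_MC))   # G_1 < G_2 < ... < G_N

# Assumed linear ramp of the fat-tail vol from 1.0 (smallest G) to 3.0 (largest G)
sigma = [1.0 + 2.0 * i / (N_MC - 1) for i in range(N_MC)]

dx_fat = [s * x for s, x in zip(sigma, g)]   # fat-tail model: sigma_lambda * G_lambda
dx_std = [SIGMA_0 * x for x in g]            # standard model:  sigma_0 * G_lambda

print(f"largest move: fat-tail {dx_fat[-1]:.2f} vs standard {dx_std[-1]:.2f}")
```

Histogramming `dx_fat` and `dx_std` with the same bins gives the comparison described in the text: the largest moves are stretched by the larger vols assigned to them.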


5 Random Numbers in Excel: To get a uniform random number in Excel, type rand()
and hit Enter. You should check the mean and standard deviation of a set of random
numbers to see if they meet your accuracy criteria. An annoying feature of Excel is that
the random numbers change every time you enter something in any cell unless you
specify “manual calculation” under Tools\Options. Apparently, there is no control over
the seed. Hence, you may want to save a set of random numbers. To do this, put the
cursor on the first random number in a column, hold down Shift-Ctrl and press the down
arrow key to highlight all the random numbers. Copy the numbers using Ctrl-C, click the
mouse in an empty column, and then use Edit\Paste-Special with the “Values” option.


6 Question to the Reader: You are using Excel (or some equally facile program) to do
your prototyping, right? 


7 Sorting in Excel: Highlight the column to be sorted, hit Data\Sort\Ascending. You 
probably already knew that. 


8 Parameters: In the example, $N_{MC} = 100$ and $\sigma_0 = 1.5$, with a deterministic fat-tail vol
prescription running from σ = 1.0 at the smallest $d_t x$ to σ = 3.0 at the largest $d_t x$.
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Fat Tail Data (light) vs. Standard Gaussian without fat tails (dark). 
Probability (vertical axis) vs. $d_t x$ return (horizontal axis)
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Fat Tail in data, not in standard model 


The fat tails are seen as more events, or higher probability, than the standard
model would predict at large values of $d_t x$, due to the assumed larger volatilities
at larger $d_t x$.


Inversion for the Fat-Tail Volatility at Definite CL in the Model 


This simple model also serves to illustrate another point. We know, by
construction for the fat-tail model, which volatility corresponds to which point.
Now suppose we announce we would like to look at the 99% confidence
level (CL). For a large number of points, we know that the Gaussian with unit
width has $k_{99\%CL} \approx 2.33$. Therefore we can calculate the volatility at the 99% CL
for any distribution as $\sigma^{99\%CL} = d_t x^{99\%CL} / 2.33$. Similar remarks hold at other
CL⁹.


9 Confidence Levels: We obviously cannot choose a CL around $d_t x = 0$ with zero standard
deviations. There is also an edge problem for the CL. To make the distribution with 100
points symmetric, the first point has CL 0.5% and the last point CL 99.5%. The results
will fluctuate; these fluctuations decrease as the number of points in the MC simulation
increases. Since only 100 points were generated, the graph is just meant for
illustration.
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The graph below shows the results for the above model for all CL above 
68%. In the example with definite input volatilities, the calculated volatilities 
roughly reproduce the input volatilities, as they should. 


Calculated Vol vs Input Vol; Pedagogical Model


(Series shown: Vol Calc, Vol Input)
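
The inversion at each CL can be sketched as below. Assumptions to note: the CL assigned to the i-th sorted point is $(i + 0.5)/N$, following the footnote on confidence levels; the vol ramp is taken as linear; and 10,000 points are used instead of the 100 in the text so that the sampling fluctuations are small. All names are illustrative.

```python
import random
from statistics import NormalDist

def calc_vols(dx_sorted, min_cl=0.68):
    """Return (CL, dx / k_CL) for every sorted point with CL above min_cl."""
    n = len(dx_sorted)
    out = []
    for i, dx in enumerate(dx_sorted):
        cl = (i + 0.5) / n                  # first point 0.5/n, last point 1 - 0.5/n
        if cl > min_cl:
            k = NormalDist().inv_cdf(cl)    # e.g. k is about 2.33 at the 99% CL
            out.append((cl, dx / k))
    return out

n = 10_000                                   # larger than the text's 100 points
rng = random.Random(11)
g = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
sigma_in = [1.0 + 2.0 * i / (n - 1) for i in range(n)]   # assumed linear vol ramp
dx = [s * x for s, x in zip(sigma_in, g)]                # increasing wherever g > 0

results = calc_vols(dx)
cl_99, vol_99 = min(results, key=lambda r: abs(r[0] - 0.99))
idx_99 = round(cl_99 * n - 0.5)
print(f"CL = {cl_99:.4f}: calculated vol = {vol_99:.3f}, input vol = {sigma_in[idx_99]:.3f}")
```

With the number of points reduced to 100 the calculated vols scatter visibly around the input vols, which is the fluctuation the footnote warns about.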
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Use of the Equivalent Gaussian Fat-Tail Volatility 


In the real world, $d_t x$ is not generated by a Gaussian with any volatility. Still,
the procedure we use¹⁰ is similar. It consists of the following steps:¹¹


1. Choose a tail CL, e.g. 99% CL. Get $d_t x^{99\%CL}$ from the data.


2. Get the 99% CL “Fat-Tail” vol as $\sigma_{FT} = d_t x^{99\%CL} / 2.33$.


3. Generate the “Fat-Tail” Gaussian using this “Fat-Tail” vol.
4. As a refinement, define separate FT vols for $d_t x > 0$, $d_t x < 0$.
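
The steps above can be sketched as follows. This is one possible reading, not the book's implementation: the tail move is taken as a linearly interpolated empirical quantile, the signed vols of step 4 are taken from the upper and lower tails of the full sample, and the mean is neglected as in the footnote. The test data are hypothetical.

```python
from statistics import NormalDist

def quantile(sorted_vals, p):
    """Linearly interpolated empirical quantile for 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def fat_tail_vols(dx, tail_cl=0.99):
    """Signed fat-tail vols: the tail move at tail_cl divided by the Gaussian k."""
    k = NormalDist().inv_cdf(tail_cl)              # about 2.33 at the 99% CL
    xs = sorted(dx)
    vol_up = quantile(xs, tail_cl) / k             # from the positive-move tail
    vol_down = -quantile(xs, 1.0 - tail_cl) / k    # from the negative-move tail
    return vol_up, vol_down

# Hypothetical data: 1000 exact quantiles of a Gaussian with vol 2.0
data = [2.0 * NormalDist().inv_cdf((i + 0.5) / 1000) for i in range(1000)]
vol_up, vol_down = fat_tail_vols(data)
print(f"FT vol (up) = {vol_up:.3f}, FT vol (down) = {vol_down:.3f}")
```

For genuinely Gaussian data the fat-tail vols simply recover the ordinary vol; for real data with outliers they come out larger than the standard deviation.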


10 History: Fat-tail Gaussians were used in 1988-89 to describe interest-rate outliers;
cf. Dash and Beilis (Ref). A physicist colleague Naren Bali earlier had the idea for fat-
tail Gaussians. While we were fitting high-energy data in 1974, he said “You can fit your
grandmother with Gaussians”. Later the wavelet mathematicians got the same idea.


11 Note: At a high confidence level for $d_t x$ the mean μ is negligible, $d_t x - \mu \approx d_t x$.
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By definition, we get the Fat Tail Gaussian empirical distribution as correct at 
the chosen tail CL. We use the Fat-Tail Gaussian to model the rest of the fat tail 
qualitatively, in an integrated fashion. The idea is in the figure below: 


Fat Tail Data (light) vs. Fat-Tail Gaussian Model (dark). 
Probability (vertical axis) vs. $d_t x$ return (horizontal axis)


The “P-P Plot”


Consider the two histograms above. The Fat-Tail Gaussian model reproduces the
data tail reasonably well. This means e.g. that at the 99% CL, the model value for
$d_t x$ is similar to the data value. However, the Standard Gaussian model needs
a higher CL than the data to get the same value of $d_t x$. For a given value of
$d_t x$ (or of the loss of a portfolio, etc.), we can plot the CL for the data on the
horizontal x-axis and the CL for the model on the vertical y-axis. If the model
exactly reproduces the data risk, the plot will just be a line at 45 degrees, shown
in the p-p plot figure below. Also shown is a hypothetical p-p plot generated by
taking a multiplier of 1.1 for the standard Gaussian model confidence levels
relative to the data. The hypothetical model risk is too low relative to the data,
since the model loss around 99% CL is the same as the data loss at only 95% CL,
so the loss tail in the data is not described by the model.

p vs. p (y = model correctly reproducing risk vs. x = data) => 45 degree line


p vs. p (y = model risk too low vs. x = data). Model loss (99% CL) = Data loss (95% CL)
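
A p-p plot of this kind can be built with a few lines of code. The sketch below assumes a zero-mean Gaussian model and assigns the empirical CL $(i + 0.5)/N$ to the i-th sorted data point; the data are hypothetical exact Gaussian quantiles with vol 2, so that a model with vol 2 lies on the 45 degree line while a model with vol 1 understates the tail risk.

```python
from statistics import NormalDist

def pp_points(data, model_vol):
    """Pair the data CL (x) with the Gaussian-model CL (y) at each data value."""
    xs = sorted(data)
    n = len(xs)
    model = NormalDist(mu=0.0, sigma=model_vol)
    return [((i + 0.5) / n, model.cdf(x)) for i, x in enumerate(xs)]

# Hypothetical data: 1000 exact quantiles of a Gaussian with vol 2.0
n = 1000
data = [2.0 * NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

good = pp_points(data, model_vol=2.0)      # model reproduces the data risk
too_low = pp_points(data, model_vol=1.0)   # model vol too small for the tail

x, y = too_low[-1]
print(f"last point: data CL = {x:.4f}, model CL = {y:.10f}")
```

In the upper tail the understated model reaches a given move only at a higher CL than the data, which is the signature of model risk that is too low.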


Practical Considerations for the Fat-Tail Parameters


The discerning reader at this point is no doubt thinking of many objections to the
above procedure. Why choose the 99% CL? What about asymmetric
distributions? Why use Gaussians at all? We have no easy answers, and we do
not think any easy answers exist. There is no theory of fat tails that is convincing.
We offer the following observations.


Why Use Fat-Tail Gaussians and not Other Distributions? 


One justification of using fat-tail Gaussians follows a long history of using 
simple functions to describe variations phenomenologically in science. In the 
end, the fat-tail Gaussians are simple to use and explain to management. There is 
no good reason to use functions that are more complicated in the absence of 
convincing phenomenology¹². Moreover, with only a few points in the tail to fit,


12 Convincing Theory or Phenomenology of Fat Tails? A convincing theory is more 
than just a lot of high-level mathematics. We also need to see a lot of phenomenology in 
many markets. We also would want stability of the parameters over time, and also 
stability with respect to different time windows. Having said that, we believe that the 
Reggeon Field Theory is a promising candidate to consider, as described in Ch. 46. 
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we have not found the robustness of the description using other functions any 
more convincing than FT Gaussians. 


How do you know that Fat-Tail Gaussians are Really Fat? 


Good question. A possible (though rare) occurrence is that the $d_t x^{99\%CL}$ in the
data is actually smaller than $2.33\,\sigma_{Std}$, with $\sigma_{Std}$ the usual standard deviation for
all $d_t x$ data. Therefore, a conservative approach is to define the FT vol as the
maximum of $\sigma_{FT}$ as defined above and $\sigma_{Std}$.


How do you justify the use of the 99% CL for the Fat Tail Vol? 


There is no unique choice. Practically speaking, the 99% CL is a reasonable
compromise. Three years of data (750 data points) represents a large effort of
data collection when all markets are considered. The 99% CL for 750 points
interpolates between the 7th and 8th biggest moves and leaves 7 points for the tail,
which “sounds” reasonable. A good procedure is to try different tail CLs and look
at the variation in the final risk results. To be really conservative, you can use the
maximum of the Fat-Tail vols over, say, points at and beyond the 99% CL. The
sensitivity depends on the question being asked (see the next section).


Bis: How do you justify the use of 99% CL for Fat Tail Vol? 


When MC simulations are run over diversified portfolios at a very high CL (e.g.
99.97% CL¹³) past the tail CL of 99%, numerically large contributors as a rule of
thumb tend to have no more than around 2 fat-tail standard deviations. Therefore,
some internal consistency is achieved by defining the fat-tail volatility at a tail
CL of around 99%. Moreover, in practice, not much change in the results at a
99.97% CL is observed with different choices of the tail CL defining the fat-tail
vol. However, substantial changes can be observed in the maximum loss.


What About the Central Part of the Distribution? Forget it. 
The fact that the central part of the $d_t x(t)$ distribution is not described by a
Gaussian with the FT vol is a correct observation. However, the central part of
the distribution contributes little to the outlier risk. If the FT vol is used in
simulation there will be some overstatement of the risk, but in practice, this
overstatement is not very significant.


13 The 99.97% CL: This “3 in 10,000” level is a popular confidence level for Economic
Capital for an AA (Aa) rated company, as will be described in Ch. 39. 
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A refinement of the fat tail approach would be to explicitly model the central 
distribution with another Gaussian and use the composite distribution. However, 
this would probably not add much in practice to the description of outlier risk. 

Note that we are not talking about adding up two Gaussian random noises; 
that would just be Gaussian. What we are talking about is adding together two 
Gaussian probability density functions with different volatilities; the result for 
this sum is not just a Gaussian. 
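
The distinction can be made concrete with the fourth moment. For a zero-mean mixture of two Gaussian densities with weights $w$ and $1 - w$, the variance is $w\sigma_1^2 + (1-w)\sigma_2^2$ and the fourth moment is $3[w\sigma_1^4 + (1-w)\sigma_2^4]$, so the kurtosis exceeds the Gaussian value of 3 whenever the two vols differ. The weights and vols in this sketch are hypothetical.

```python
def mixture_moments(w, s1, s2):
    """Variance and kurtosis of w*N(0, s1^2) + (1-w)*N(0, s2^2), a sum of densities."""
    var = w * s1**2 + (1.0 - w) * s2**2
    fourth = 3.0 * (w * s1**4 + (1.0 - w) * s2**4)
    return var, fourth / var**2

# Hypothetical composite: 90% weight on vol 1.0 (central part), 10% on vol 3.0 (tail)
var_mix, kurt_mix = mixture_moments(0.9, 1.0, 3.0)

# Adding two independent Gaussian noises instead gives a Gaussian:
# variance s1^2 + s2^2 and kurtosis exactly 3.
var_sum, kurt_sum = 1.0**2 + 3.0**2, 3.0

print(f"mixture of densities: variance {var_mix:.2f}, kurtosis {kurt_mix:.2f}")
print(f"sum of noises:        variance {var_sum:.2f}, kurtosis {kurt_sum:.2f}")
```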

Actually, it makes more sense to fit distributions in normal and stressed 
regimes separately rather than trying to fit the whole distribution over all times. 
We want a risk measure for stressed regimes here, which we get with fat-tail vols. 


Overlapping vs. Non-overlapping Windows and Data 


Serial Correlation Problem and a Lemma to the Data Theorem 
Serial correlations occur with overlapping data windows. The reader should note
that, for daily differences using daily data, the windows used to define $d_t x$ do
not overlap, and there is no serial correlation. We recommend the daily difference
procedure anyway (see discussion on No Credit for Mean Reversion below).

However, for the choice of differences over more than one day, overlapping 
data windows pose a thorny practical problem. The reader should note that, 
independent of mathematical purity, in the real world there usually is no choice. 
The use of overlapping windows is unavoidable if there are not enough data to 
use independent non-overlapping windows. In fact, we have a Lemma: 

A strong Lemma to the Black-Hole Data Theorem states: “There are not 
enough data points for independent non-overlapping windows, almost 
everywhere". This is not a joke. 


Serial Correlation and a Red Herring 

The serial correlation problem is in one sense a red herring. If differences over
time window intervals $Dt > 1$ day are used, numerical experiments show that
overlapping windows do not substantially degrade the determination of $d_t x$ at
a given CL¹⁴. So the good news is that numerically the use of overlapping
windows does not have much effect, provided that the size of the windows is
small compared to the total length of the data.


" Window Caveats: Naturally, the window size has to be small with respect to the total 
data sample for this to work. In addition, the ratio of risks from CL₁ to CL₂ for real data
(not a Gaussian model) has been examined. A similar statement holds: overlapping
windows do not degrade the analysis as long as the window size is small compared to the
total data time series length.

To test this statement in a simple example, consider a series of 500 Gaussian
R# (equivalent to 2 years of data). We first look at $Dt = 10$-day overlapping
moving windows, add up the 10 random numbers in each window to define 491
values of $d_t x$, sort-order these values, and thereby extract the $d_t x^{CL}$ at each
possible CL. We then look at independent non-overlapping 10-day windows.
Naturally, there are fewer CLs that can be examined using independent windows
(CL = 2%, ..., 98%). There are ten sets of 49 independent windows
corresponding to the ten possible starting days for each set, and therefore there
are ten results for $d_t x^{CL}$ at each CL. For each CL we plot $d_t x^{CL}$ from the
overlapping windows and we plot the average $d_t x^{CL}$ from the ten independent
windows along with the standard deviations.
The results are in the graph below: 


Overlapping vs Independent Window
Statistics With Errors; Gaussian model


(Series shown: Avg. Indep. +- Error, Overlapping)

It is seen that the overlapping window results $d_t x^{CL}$ across the CL spectrum
are within the errors of the ten independent window estimates for $d_t x^{CL}$.


Let's look at the highest CL (98%) available here, $d_t x^{98\%CL}[Dt = 10]$. The
plot below shows that the results for ten sets of independent windows oscillate
around the result for the overlapping windows. This means that the use of
overlapping windows does not give misleading results.


98% CL: Indep vs Overlap Windows


(Series shown: Independent 98% CL, Overlapping 98% CL)

We conclude with a remark about worst-case moves. Naturally, if we are
looking at worst-case moves within any 10-day window we want to look at
overlapping windows to get the largest number of possible states. The worst case
will lie somewhere in one of the sets of independent non-overlapping
windows.
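
The numerical experiment of this section can be reproduced along the following lines. This is a sketch, assuming a linearly interpolated empirical quantile and an arbitrary seed; each of the ten independent sets is truncated to 49 windows as in the text, so the single window starting on day 490 appears only in the overlapping set.

```python
import random
from statistics import mean, pstdev

def quantile(values, cl):
    """Linearly interpolated empirical quantile at confidence level cl."""
    xs = sorted(values)
    pos = cl * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

N_DAYS, DT, CL = 500, 10, 0.98
rng = random.Random(3)                                  # arbitrary seed
r = [rng.gauss(0.0, 1.0) for _ in range(N_DAYS)]

# 491 overlapping 10-day windows, one starting on each possible day
overlapping = [sum(r[i:i + DT]) for i in range(N_DAYS - DT + 1)]

# Ten sets (one per starting day) of 49 non-overlapping 10-day windows
independent_sets = [
    [sum(r[s + j * DT: s + (j + 1) * DT]) for j in range(49)]
    for s in range(DT)
]

q_overlap = quantile(overlapping, CL)
q_indep = [quantile(w, CL) for w in independent_sets]
print(f"overlapping: {q_overlap:.2f}; "
      f"independent: {mean(q_indep):.2f} +- {pstdev(q_indep):.2f}")
```

Repeating the run for other confidence levels, or other seeds, gives the comparison plotted in the figures above.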


What about the Difference Time and Mean Reversion? 

The difference time $dt$ for $d_t x$ is not unique. If the series $d_t x(t)$ exhibits mean
reversion, a common assumption, then at larger $Dt > dt$ we get some damping,
$d_t x[Dt] < \sqrt{Dt/dt} \cdot d_t x[dt]$. Conservatively, however, we may not want to allow
credit for mean reversion in performing tail-risk analysis¹⁵. This argues for
square-root-time scaling up from daily differences, i.e. $dt = 1$ Day and


$d_t x[Dt] = \sqrt{Dt / 1\,\text{Day}} \cdot d_t x[1\,\text{Day}]$.


15 No Mean Reversion Credit for Traders in a Turbulent Market: Assumptions about
the existence of mean reversion are nefarious for risk. Attempts by traders to use mean
reversion to minimize risk should be treated with suspicion. When mean reversion comes
up in the discussion (as it will), it will go like this: “Markets are mean reverting. That’s
how we run our business. Therefore your assumption of no mean reversion is way too
conservative”. You might ask the traders the sarcastic question of how well the
assumption of mean reversion worked for the convergence plays of the fabulous traders at
some world-renowned hedge funds, before those strategies crashed and burned in 1998. The
serious issue is that in a turbulent environment corresponding to severe outlier risk (at
which you do want to look) assumptions of mean reversion and convergence break
down. There’s no free lunch.


What about Up/Down Data Move Asymmetry? 


The data for $d_t x$ are usually asymmetric for positive vs. negative moves. Two
signed FT vols $\sigma_{FT}^{(\pm)}$ can be chosen independently at the tail CL for positive and
for negative moves. The signed FT vols can and should be used in MC
simulations. If a positive (negative) R# is generated, use $\sigma_{FT}^{(+)}$ ($\sigma_{FT}^{(-)}$).
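
The rule for using signed vols in a simulation can be sketched as below. The function name, the seed, and the two vol values are hypothetical; in practice the vols would come from the tail analysis of the data.

```python
import random

def simulate_signed(n, vol_up, vol_down, rng):
    """Scale each Gaussian random number by the FT vol matching its sign."""
    moves = []
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)
        vol = vol_up if g > 0 else vol_down   # positive R# -> up vol, negative -> down vol
        moves.append(vol * g)
    return moves

# Hypothetical signed FT vols: down moves fatter than up moves
moves = simulate_signed(10_000, vol_up=2.0, vol_down=3.0, rng=random.Random(5))
print(f"largest up move {max(moves):.2f}, largest down move {min(moves):.2f}")
```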


What is the Relation of the Fat Tail Vol with Huge Jumps? 


Naturally, jumps make the empirical volatility change as a function of time. If
there is a big jump, it will increase the volatility for the time that the jump
remains in the time window. This is one way to look at the dynamics behind the
increase in volatility as $d_t x$ increases. To illustrate the idea, consider the
following 500 Monte-Carlo Gaussian changes with unit daily vol, including a big
10 standard deviation 1-day jump added in the middle of the time series. The
results for the 65-day windowed vols are shown below¹⁶:


Windowed Vol Without/With 10 sd jump


(Series shown: Windowed Vol 2 no jump, Windowed Vol 2 with jump)


16 Label: The label Vol 2 is there for reference in Ch. 25.


The windowed vol fluctuates around the input vol of one, except for the
windows in which the big jump appears. The big jump shows up in the histogram
of $d_t x$ at the maximum point, past the 99% CL. Therefore it does not affect the
determination of the fat-tail vol $\sigma_{FT}$.
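
The windowed-vol illustration is easy to regenerate. In this sketch the windowed vol is taken to be the standard deviation about the window mean over a trailing 65-day window; that definition, the seed, and the exact position of the jump (index 250) are assumptions, since the text does not specify them.

```python
import random
from statistics import pstdev

def windowed_vol(dx, window=65):
    """Standard deviation of dx over each trailing window of `window` points."""
    return [pstdev(dx[end - window + 1:end + 1]) for end in range(window - 1, len(dx))]

N, JUMP_AT, JUMP_SD, WINDOW = 500, 250, 10.0, 65
rng = random.Random(21)                       # arbitrary seed
dx_no_jump = [rng.gauss(0.0, 1.0) for _ in range(N)]
dx_jump = list(dx_no_jump)
dx_jump[JUMP_AT] += JUMP_SD                   # add a 10 sd one-day jump mid-series

vol_no_jump = windowed_vol(dx_no_jump, WINDOW)
vol_jump = windowed_vol(dx_jump, WINDOW)
print(f"max windowed vol: no jump {max(vol_no_jump):.2f}, with jump {max(vol_jump):.2f}")
```

The two series coincide except for the 65 windows that contain the jump, where the windowed vol steps up and then drops back once the jump leaves the window.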


The MaxVol 
We could define a “MaxVol” using the jump in the above paragraph
as $\sigma_{MaxVol} = d_t x^{Max} / k_{Max}$, where $k_{Max}$ is the number of standard deviations
corresponding to the maximum CL obtainable from the total time series. For the
most conservative available historical estimate, this MaxVol can be used. The
MaxVol corresponds to using a Gaussian that fits the worst-case move.
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22. Correlation Matrix Formalism and the N - 
Sphere (Tech. Index 8/10) 


The Importance and Difficulty of Correlation Risk 


Correlation risk is one of the most dangerous and least-analyzed risks. In order to
come to grips with correlation risk with many variables, we need to be able to
deal with the problem of stressed correlation matrices. Stressed correlations are
needed for robust risk management because correlations are notoriously
unstable¹. In particular, during stressed markets, correlations often increase
dramatically. Therefore, for risk assessment, we need to stress the correlations.

Unfortunately, the procedure of stressing the correlations is problematic.
Often, stressed correlation matrices are non-positive definite (NPD) matrices
because they arise from inconsistent stressing of the various correlation matrix
elements that are interdependent and constrained².

In this section, we discuss the representation of correlations in N dimensions,
i.e. correlation matrices with elements as correlations between pairs of N
variables. One of the main problems with correlations is that they are dependent.
By reformulating the problem geometrically, we recast the correlations into
functions of independent angular variables³. These angles are the natural
spherical co-ordinates for an $\mathcal{N}$-sphere where $\mathcal{N} = N - 1$. In this way, the
problem of describing positive definite (PD) correlation matrices becomes
tractable. With this, we are then able to deal with stressed correlation matrices.


1 Correlation Risk, Timescales, Stressed Correlations and Stochastic Correlations:
Correlation risk is insidious and needs to take a more prominent place in risk 
management. Scenario stress analysis or a stochastic model for correlations is needed to 
determine correlation risk. 


2 Non-Positive Definite Matrices in Ordinary Data Collection: NPD correlation
matrices may also arise from inconsistent data, without any stress procedure at all. For 
example some data may be daily and other data only available monthly. Bad data or 
missing data is another problem (see also below). 


3 History: My formulation of this geometrical approach was developed in 1993. The idea
was mentioned in my CIFEr tutorial starting in 1996. I noticed the connection with the 
Cholesky decomposition in 1999. The basic idea came from running Monte Carlo 
simulations of n-dimensional phase space in high-energy particle physics in the 1960’s. 
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In general, an arbitrarily stressed correlation matrix will be non-positive- 
definite (NPD). One idea that we will present in Ch. 24 is to get a best PD matrix 
fit to the NPD matrix, using a least-squares fit in the angular variables. The 
problem can be cast into the geometrical problem of determining the point on an
$\mathcal{N}$-sphere (corresponding to the PD stressed correlation matrix) that is the
closest to a point off the $\mathcal{N}$-sphere (the target NPD correlation matrix)⁴.

A more recent idea keeps intact the maximum number of target correlations 
in the final correlation matrix using nearest-neighbor “NNR” techniques (see Ch. 
24). 


One Correlation in Two Dimensions 


We will proceed slowly, starting with low dimensional cases and gradually
building up, making sure that the physical intuitive picture is always present. The
basic starting point is the recognition that a correlation corresponds to the cosine
of an angle, $\rho_{\alpha\beta} = \cos(\theta_{\alpha\beta})$. These angles are, however, dependent because they
are constrained by laws of cosines. Our goal is to find a set of independent angles
to recast the correlations in a consistent manner.


We start with a set of orthogonal unit vectors $\{\hat{e}_\alpha\}$, $\alpha = 1 \ldots N$, forming a
basis in Euclidean N-space⁵. We imagine that the time changes $d_t x_1(t)$ of the
first variable⁶ $x_1$ are measured along the first axis with unit vector $\hat{e}_1$. The time
changes $d_t x_2(t)$ of the second variable are measured at an angle $\theta_{12}$ with respect
to the first axis in the plane formed by the first two unit vectors $(\hat{e}_1, \hat{e}_2)$. The
time runs over the time window producing $N_w$ time differences for both
variables $d_t x_1(t_l)$, $d_t x_2(t_l)$ (e.g. over 3 months). The correlation and angle $\theta_{12}$
are given by the usual expression


4 What we really mean by the “Point on the N-Sphere”: Actually for N variables, there
are N points forming a configuration on the N-1 unit sphere. For convenience we want to 
refer to a "point" representing all N vectors. For example we could use a centroid point 
for reference. 


5 Orthogonality in 3D: Position your right hand with your thumb pointing up, your index 
finger pointing straight ahead, and your middle finger pointing to the left. 


6 Change of Variables: The variable can be $x_1 = \ln(r_1)$ for returns, where $r_1$ is the
physical first variable, etc.

$\rho_{12} = \cos(\theta_{12}) = \frac{1}{N_w \sigma_1 \sigma_2} \sum_{l=1}^{N_w} \left[ d_t x_1(t_l) - \langle d_t x_1 \rangle \right] \left[ d_t x_2(t_l) - \langle d_t x_2 \rangle \right]$ (22.1)


Here the $d_t x_1(t)$ time average is $\langle d_t x_1 \rangle$ and its volatility is $\sigma_1$. Notice that this
looks exactly like the dot product of two vectors in an $N_w$-dimensional “time”
space to form the cosine of the angle $\theta_{12}$. This “time” space is associated with,
but is not the same as, the space that we use for the independent angles⁷.
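
The dot-product picture can be checked numerically: center each series, treat it as a vector in the $N_w$-dimensional “time” space, and take the cosine of the angle between the two vectors. The sketch below uses made-up four-point series purely for illustration.

```python
import math

def correlation_and_angle(dx1, dx2):
    """Correlation as the cosine of the angle between two centered time-series vectors."""
    n = len(dx1)
    m1, m2 = sum(dx1) / n, sum(dx2) / n
    v1 = [a - m1 for a in dx1]
    v2 = [b - m2 for b in dx2]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    rho = dot / (norm1 * norm2)
    theta = math.acos(max(-1.0, min(1.0, rho)))   # clamp guards against round-off
    return rho, theta

# Made-up series: orthogonal after centering, so rho = 0 and theta = pi/2
rho, theta = correlation_and_angle([1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0])
print(f"rho = {rho:.3f}, theta = {math.degrees(theta):.1f} degrees")
```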


Two Correlations in Three Dimensions; the Azimuthal Angle 


Adding another variable $x_3$ leads to fluctuations of the time differences $d_t x_3$
partly in the $(\hat{e}_1, \hat{e}_2)$ plane and partly out of that plane, in the direction $\hat{e}_3$. We
introduce the azimuthal angle $\varphi_{23}$ to fix the projection of the 3rd variable $d_t x_3$
on the axis $\hat{e}_3$ by a rotation in the $(\hat{e}_2, \hat{e}_3)$ plane by $\varphi_{23}$, leaving the projection
on $\hat{e}_1$ fixed. We have three correlations $\rho_{12} = \cos(\theta_{12})$, $\rho_{13} = \cos(\theta_{13})$, and
$\rho_{23} = \cos(\theta_{23})$. Each of these angles is measured between the time-averaged
changes in the corresponding pair of variables. These angles are dependent
because of the law of cosines, viz


$\cos(\theta_{23}) = \cos(\theta_{12}) \cos(\theta_{13}) + \sin(\theta_{12}) \sin(\theta_{13}) \cos(\varphi_{23})$ (22.2)


Directly in terms of the correlations,


$\rho_{23} = \rho_{12}\,\rho_{13} + \sqrt{1 - \rho_{12}^2}\, \sqrt{1 - \rho_{13}^2}\, \cos(\varphi_{23})$ (22.3)


This constraint is the essential problem. Imagine what happens if we try to stress
the correlations in some way. We cannot arbitrarily choose $\{\rho_{12}, \rho_{13}, \rho_{23}\}$
because if we do, the law of cosines will in general not be satisfied for any choice
of the azimuthal angle $\varphi_{23} \in [0, 2\pi)$. If the law of cosines is not satisfied, the
fluctuations of the three variables cannot physically exist⁸ ⁹.


7 Fibers: The $N_w$ values of the time changes of each of the N variables exist along a
geometrical fiber above the space that we use to describe the correlations. This space of
the N fibers is N-dimensional. That's it for differential geometry, guys.

Another Picture: The $N_w$ space (for unit vols) is the $N_w$ sphere in which the N time series
are unit vectors. Picture the $\mathcal{N} = N - 1$ sphere inside the $N_w$ sphere. Then $\mathcal{N} < N_w$ is
clear. I thank M. Bondioli for discussions on this point.

OTHER NOTATION: In Ch. 38, $N_w$ (Ch. 22) → N (Ch. 38) and N (Ch. 22) → p (Ch. 38).


The clue to the puzzle is already here. The trick is to swap the dependent
variable $\rho_{23}$ for the azimuthal angle $\varphi_{23}$. That's all we have to do.

Now if we were in fact to describe the correlations using the variables
$\{\theta_{12}, \theta_{13}, \varphi_{23}\}$ then we would have an independent set of angles. We could vary
these angles arbitrarily. Each set of angles corresponds to a definite set of
correlations $\{\rho_{12}, \rho_{13}, \rho_{23}\}$ using the above relations, and each set of
correlations corresponds to an independent set of angles.
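
The constraint can be tested directly by solving eq. (22.3) for $\cos(\varphi_{23})$ and checking whether a real angle exists. The sketch below assumes $|\rho_{12}| < 1$ and $|\rho_{13}| < 1$ so that the denominator is nonzero; the function name and the two trial sets of correlations are illustrative.

```python
import math

def azimuthal_angle(rho12, rho13, rho23):
    """Solve eq. (22.3) for phi_23; return None if no real angle exists.

    Assumes |rho12| < 1 and |rho13| < 1 so the denominator is nonzero.
    The returned angle lies in [0, pi]; 2*pi minus it gives the same cosine.
    """
    denom = math.sqrt(1.0 - rho12**2) * math.sqrt(1.0 - rho13**2)
    cos_phi = (rho23 - rho12 * rho13) / denom
    if abs(cos_phi) > 1.0:
        return None                     # the law of cosines cannot be satisfied
    return math.acos(cos_phi)

consistent = azimuthal_angle(0.5, 0.5, 0.5)      # a real angle exists
stressed = azimuthal_angle(0.9, 0.9, -0.9)       # arbitrarily stressed: no real angle
print(consistent, stressed)
```

The second set of correlations looks innocuous element by element, but no real azimuthal angle reproduces it, which is the geometric statement that the corresponding matrix is not positive definite.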

For convenience, we define unit vectors $\{d_t \hat{y}_\alpha\}$ by dividing out the volatility
from the fluctuations about the average for each variable, namely

$d_t \hat{y}_\alpha = \left[ d_t x_\alpha - \langle d_t x_\alpha \rangle \right] / \sigma_\alpha$ (22.4)


We can represent each $d_t \hat{y}_\alpha$ (over the collection of times in the time window) by
a point on the 2-sphere using spherical co-ordinates,¹⁰


8 Missing Data or Bad Data and Non Positive Definite Correlation problems again:
Say that series $x_1$ has a missing or bad data point at some time t*. Then correlations $\rho_{12}$,
$\rho_{13}$ of series $x_2$, $x_3$ to series $x_1$ will be corrupted and eqn. (22.3) for $\rho_{23}$ may not hold for
any real azimuthal angle $\varphi_{23}$; if so a negative eigenvalue in the correlation matrix will
exist.

For this reason work is needed to “fix” bad data / missing data holes. One promising 
technique is MSSA (Multiple Singular Spectral Analysis) used in geophysics (ref). 


9 The Forbidden Hyperbolic Geometry, Relativity, and Imaginary Azimuthal
Angles: If the law of cosines is satisfied, fluctuations for the x variables are associated 
with points on the 2-sphere in 3-space. Transformations on the 2-sphere producing 
different correlations are described by the rotation group O(3). If the law of cosines is not 
satisfied with a real azimuthal angle, the azimuthal variable will be imaginary. Then 
fluctuations of the x variables are associated with points on an unphysical non-compact 
hyperbolic space with an imaginary axis similar to Einstein's special relativity theory. 
Mathematically, transformations on this hyperbolic space need generalized orthogonal 
groups. In three dimensions, instead of the compact group O(3) we get the non-compact 
group O(2,1). Naturally, all this is forbidden in finance. There, did that wake up all you 
high-energy theorists? 


10 Vectors and Time Series: For notational simplicity we use the same symbol $d_t\hat{y}_\alpha$ to
represent a time series and a vector in the abstract space.
$d_t\hat{y}_1 = \hat{e}_1$
$d_t\hat{y}_2 = \cos(\theta_{12})\,\hat{e}_1 + \sin(\theta_{12})\,\hat{e}_2$ (22.5)
$d_t\hat{y}_3 = \cos(\theta_{13})\,\hat{e}_1 + \sin(\theta_{13}) \left[ \cos(\varphi_{23})\,\hat{e}_2 + \sin(\varphi_{23})\,\hat{e}_3 \right]$


This gives the connection of the formalism with spherical geometry in 3D. 
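As a quick numerical check of eq. (22.5), the sketch below builds the three unit vectors from a set of angles and confirms that their dot products reproduce the correlations, including the law of cosines for $\rho_{23}$. The function names and the sample angles are illustrative assumptions, not part of the text.

```python
import math

def unit_vectors_3d(theta12, theta13, phi23):
    """Components of the three unit vectors of eq. (22.5) in the (e1, e2, e3) basis."""
    y1 = (1.0, 0.0, 0.0)
    y2 = (math.cos(theta12), math.sin(theta12), 0.0)
    y3 = (math.cos(theta13),
          math.sin(theta13) * math.cos(phi23),
          math.sin(theta13) * math.sin(phi23))
    return y1, y2, y3

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Arbitrary illustrative angles (hypothetical, not from any data set).
theta12, theta13, phi23 = 0.7, 1.1, 2.0
y1, y2, y3 = unit_vectors_3d(theta12, theta13, phi23)

# Correlations are dot products of the unit vectors.
rho12, rho13, rho23 = dot(y1, y2), dot(y1, y3), dot(y2, y3)

# Law of cosines for the dependent correlation rho23.
law_of_cosines = (math.cos(theta12) * math.cos(theta13)
                  + math.sin(theta12) * math.sin(theta13) * math.cos(phi23))
print(rho12, rho13, rho23, law_of_cosines)
```

Any real choice of the three angles gives a consistent trio of correlations, which is the point of swapping $\rho_{23}$ for $\varphi_{23}$.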


The Degenerate World of FX Triangles 


Before going ahead with higher dimensions, we pause for a brief description of 
the degenerate world of FX triangles. The degeneracy is because three variables 
are linearly dependent. Consider the trio of FX rates: 


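$\eta_{XYZ} = \dfrac{\#\,\text{Units}(XYZ)}{\text{One USD}}, \quad \eta_{UVW} = \dfrac{\#\,\text{Units}(UVW)}{\text{One USD}}, \quad \eta_{UVW/XYZ} = \dfrac{\#\,\text{Units}(UVW)}{\text{One XYZ}}$ (22.6)

Let $x_1 = \ln \eta_{XYZ}$, $x_2 = \ln \eta_{UVW}$, $x_3 = \ln \eta_{UVW/XYZ} = x_2 - x_1$. Then the $(\hat{e}_1, \hat{e}_2)$
plane contains $d_t x_3 = d_t x_2 - d_t x_1$, and this degeneracy produces $\varphi_{23} = 0$¹¹. We
find after some algebra $\rho_{23} = (\sigma_2 - \rho_{12}\sigma_1)/\sigma_3$, $\rho_{13} = -(\sigma_1 - \rho_{12}\sigma_2)/\sigma_3$,
and $\sigma_3^2 = \sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\rho_{12}$, producing $\rho_{23} = \rho_{12}\rho_{13} + \sqrt{1-\rho_{12}^2}\,\sqrt{1-\rho_{13}^2}$.
Here is a picture of an FX triangle for the three $d_t x_\alpha$, $\alpha = 1,2,3$.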


[Figure: Triangle Relation for FX Rates. The three legs of the triangle are the $d_t x_\alpha$, $\alpha = 1,2,3$.]


11 A zero eigenvalue for degenerate FX: $\varphi_{23} = 0$ produces a zero eigenvalue for this
correlation matrix.
The averages are assumed subtracted. The angles can be determined by the
historical correlations. Closing the triangle then gives the lengths of the legs, i.e. 
the volatilities expected from the correlations. Conversely, given recent moves of 
the variables, we can look at the angles needed to make the triangle close. 
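The triangle relations can be checked numerically. The sketch below assumes the convention $x_3 = x_2 - x_1$ (one consistent reading of the degenerate trio); the function names and input numbers are hypothetical. It verifies that the third correlation sits on the degenerate branch $\varphi_{23} = 0$ and that the 3x3 correlation matrix has zero determinant, i.e. a zero eigenvalue.

```python
import math

def fx_triangle(sigma1, sigma2, rho12):
    """Vol of x3 = x2 - x1 and its correlations with x1 and x2."""
    sigma3 = math.sqrt(sigma1**2 + sigma2**2 - 2.0 * sigma1 * sigma2 * rho12)
    rho13 = -(sigma1 - rho12 * sigma2) / sigma3
    rho23 = (sigma2 - rho12 * sigma1) / sigma3
    return sigma3, rho13, rho23

def det3(rho12, rho13, rho23):
    """Determinant of the 3x3 correlation matrix."""
    return 1.0 + 2.0 * rho12 * rho13 * rho23 - rho12**2 - rho13**2 - rho23**2

sigma1, sigma2, rho12 = 0.10, 0.12, 0.4      # hypothetical inputs
sigma3, rho13, rho23 = fx_triangle(sigma1, sigma2, rho12)

# Law of cosines with azimuthal angle 0 (cos = 1): the degenerate case.
closed = rho12 * rho13 + math.sqrt(1.0 - rho12**2) * math.sqrt(1.0 - rho13**2)
print(sigma3, rho13, rho23, closed, det3(rho12, rho13, rho23))
```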


FX Option-Implied Correlations 
Another version" of the above FX triangle is to consider the lengths of the lines 
as given by implied volatilities $\sigma_\alpha^{(\text{Implied})}$, $\alpha = 1,2,3$. These would be the vols of
the FX options corresponding to each of the $x_\alpha$ variables (naturally at consistent


strikes and maturities). Then, option-implied correlations are obtained from the 
angles such that the three legs close. 
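A minimal sketch of the option-implied correlation, assuming the three implied vols refer to $x_1$, $x_2$ and the cross $x_3 = x_2 - x_1$, follows from closing the triangle, $\sigma_3^2 = \sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\rho_{12}$. The function name and the numbers are illustrative assumptions.

```python
import math

def implied_correlation(vol1, vol2, vol_cross):
    """Correlation of x1 and x2 implied by three vols when x3 = x2 - x1."""
    rho = (vol1**2 + vol2**2 - vol_cross**2) / (2.0 * vol1 * vol2)
    if not -1.0 <= rho <= 1.0:
        # The three line lengths cannot form a triangle.
        raise ValueError("the three vols violate the triangle inequality")
    return rho

# Hypothetical implied vols for the two USD pairs and the cross.
rho12 = implied_correlation(0.10, 0.12, math.sqrt(0.0148))
print(rho12)
```

If the three vols cannot form a triangle, no real correlation exists and the function raises an error rather than returning a value outside [-1, 1].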


This Triangle Doesn’t Close 


Of course, if you think you have a handle on specifying both the volatilities (line
lengths) and correlations (angles), in general the triangle won’t close. Some FX 
traders may have fun trying to find arbitrage opportunities using this idea. 


Correlations in Four Dimensions - Picture 


[Figure: Four-Dimensional Geometry for Four Variables with Unit-Vector Changes on the 3-Sphere. Labels: $d_t\hat{y}_1 = \hat{e}_1$; $d_t\hat{y}_2$ in the $(\hat{e}_1, \hat{e}_2)$ plane.]
We need to extend our formalism from three to N dimensions to get the result for
an arbitrary correlation matrix using standard spherical co-ordinates. As we add
another variable, we increase by one the dimensionality of the space and write
down the extra spherical co-ordinates necessary to describe this new variable.

We need to introduce a compact notation to keep the equations from running
off the page. We set $C_{\theta_{\alpha\beta}} = \cos(\theta_{\alpha\beta})$, $S_{\varphi_{\alpha\beta}} = \sin(\varphi_{\alpha\beta})$ etc. We also put
subscripts on the brackets like $[\,\cdots]_{\alpha\beta}$ in order to keep them straight.

Introducing a fourth variable with similar notation $d_t\hat{y}_4$ for the unit vector of
the changes, we also have to introduce a 4th orthogonal unit vector $\hat{e}_4$, an extra
polar angle $\theta_{14}$ and two extra azimuthal angles $\varphi_{24}, \varphi_{34}$. The polar angle $\theta_{14}$
gives the projection of $d_t\hat{y}_4$ on $\hat{e}_1$. Then we rotate around $\hat{e}_1$ by angle $\varphi_{24}$ in the
$(\hat{e}_2, \hat{e}_3)$ plane to fix the projection of $d_t\hat{y}_4$ on $\hat{e}_2$. Finally, we rotate by $\varphi_{34}$
about $\hat{e}_2$ in the $(\hat{e}_3, \hat{e}_4)$ plane to fix the projection of $d_t\hat{y}_4$ on $\hat{e}_3$. The remainder
is projected on the new dimension $\hat{e}_4$.

Therefore, we get the decomposition of the unit vector $d_t\hat{y}_4$ for the fourth
variable as

$d_t\hat{y}_4 = C_{\theta_{14}}\hat{e}_1 + S_{\theta_{14}} \left[ C_{\varphi_{24}}\hat{e}_2 + S_{\varphi_{24}} \left[ C_{\varphi_{34}}\hat{e}_3 + S_{\varphi_{34}}\hat{e}_4 \right] \right]$ (22.7)


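We now can reconstruct all the correlation matrix elements in four
dimensions. They are all given by $\rho_{\alpha\beta} = d_t\hat{y}_\alpha \cdot d_t\hat{y}_\beta$. Therefore, we merely
need to plug in the spherical decompositions. There are three new correlations.
They are

$\rho_{14} = C_{\theta_{14}}$
$\rho_{24} = C_{\theta_{24}} = C_{\theta_{12}} C_{\theta_{14}} + S_{\theta_{12}} S_{\theta_{14}} C_{\varphi_{24}}$
$\rho_{34} = C_{\theta_{34}} = C_{\theta_{13}} C_{\theta_{14}} + S_{\theta_{13}} S_{\theta_{14}} \left[ C_{\varphi_{23}} C_{\varphi_{24}} + S_{\varphi_{23}} S_{\varphi_{24}} C_{\varphi_{34}} \right]$ (22.8)

The correlations $\rho_{14}, \rho_{24}$ have the same form as what we had in the 3D case for
$\rho_{13}, \rho_{23}$. This is not surprising because we could have introduced the third and
fourth variables in the opposite order. The correlation $\rho_{34}$ has a new and more
complicated form.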
We see by construction that all correlations are specified once we specify the
angles $\{\theta_{1\beta}\}, \{\varphi_{\alpha\beta}\}$. Moreover, we can now see a simple sequential algorithm to
get the angles from the correlations. For example, we get $\varphi_{24}$ from the
correlations $\{\rho_{12}, \rho_{14}, \rho_{24}\}$. Given $\varphi_{24}$ (and $\varphi_{23}$ which we already know), we
get $\varphi_{34}$ from the set $\{\rho_{13}, \rho_{14}, \rho_{34}, \varphi_{23}, \varphi_{24}\}$.
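The sequential algorithm can be sketched in a few lines for four variables. The code below is an illustration under stated assumptions: the function names and test angles are hypothetical, all angles are taken in $(0, \pi)$ so that the sines are positive, and the forward map is the Gram matrix of the unit vectors in eqs. (22.5) and (22.7).

```python
import math

def corr_from_angles_4d(t12, t13, t14, p23, p24, p34):
    """Gram matrix of the four unit vectors, eqs. (22.5) and (22.7)."""
    c, s = math.cos, math.sin
    ys = [
        [1.0, 0.0, 0.0, 0.0],
        [c(t12), s(t12), 0.0, 0.0],
        [c(t13), s(t13) * c(p23), s(t13) * s(p23), 0.0],
        [c(t14), s(t14) * c(p24), s(t14) * s(p24) * c(p34), s(t14) * s(p24) * s(p34)],
    ]
    return [[sum(a * b for a, b in zip(u, v)) for v in ys] for u in ys]

def angles_from_corr_4d(rho):
    """Sequential inversion of the 3D law of cosines and eq. (22.8), positive-sine branch."""
    c, s = math.cos, math.sin
    acos = lambda x: math.acos(max(-1.0, min(1.0, x)))   # guard against round-off
    # Polar angles come directly from the first row.
    t12, t13, t14 = acos(rho[0][1]), acos(rho[0][2]), acos(rho[0][3])
    # Azimuthal angles, one at a time, each using angles already known.
    p23 = acos((rho[1][2] - c(t12) * c(t13)) / (s(t12) * s(t13)))
    p24 = acos((rho[1][3] - c(t12) * c(t14)) / (s(t12) * s(t14)))
    inner = (rho[2][3] - c(t13) * c(t14)) / (s(t13) * s(t14))
    p34 = acos((inner - c(p23) * c(p24)) / (s(p23) * s(p24)))
    return t12, t13, t14, p23, p24, p34

angles = (0.7, 1.1, 0.9, 2.0, 1.3, 0.6)      # hypothetical angles in (0, pi)
recovered = angles_from_corr_4d(corr_from_angles_4d(*angles))
print(recovered)
```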


Correlations in Five and Higher Dimensions 

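We now can see the general pattern. We continue to introduce new dimensions
along with more angles. For five dimensions, we need one more polar angle $\theta_{15}$
and three more azimuthal angles to specify the position of $d_t\hat{y}_5$. The equations
for $\{\rho_{25}, \rho_{35}, \rho_{45}\}$ successively determine these new azimuthal angles
$\{\varphi_{25}, \varphi_{35}, \varphi_{45}\}$. The equations for the new correlations in five dimensions are

$\rho_{15} = C_{\theta_{15}}$
$\rho_{25} = C_{\theta_{25}} = C_{\theta_{12}} C_{\theta_{15}} + S_{\theta_{12}} S_{\theta_{15}} C_{\varphi_{25}}$
$\rho_{35} = C_{\theta_{35}} = C_{\theta_{13}} C_{\theta_{15}} + S_{\theta_{13}} S_{\theta_{15}} \left[ C_{\varphi_{23}} C_{\varphi_{25}} + S_{\varphi_{23}} S_{\varphi_{25}} C_{\varphi_{35}} \right]$
$\rho_{45} = C_{\theta_{45}} = C_{\theta_{14}} C_{\theta_{15}} + S_{\theta_{14}} S_{\theta_{15}} \cdot \left[ C_{\varphi_{24}} C_{\varphi_{25}} + S_{\varphi_{24}} S_{\varphi_{25}} \left[ C_{\varphi_{34}} C_{\varphi_{35}} + S_{\varphi_{34}} S_{\varphi_{35}} C_{\varphi_{45}} \right] \right]$ (22.9)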


For example, consider this 5x5 correlation matrix: 


[Table: Corr matrix ($\rho$), followed by the resulting values of $\cos\theta$ and $\cos\varphi$. The numerical entries are missing here.]
In general, for the $\alpha$-th variable $d_t\hat{y}_\alpha$ we need one more polar angle $\theta_{1\alpha}$ to
specify the projection on $\hat{e}_1$ along with $\alpha - 2$ more azimuthal angles
$\{\varphi_{2\alpha}, \varphi_{3\alpha}, \ldots, \varphi_{\alpha-1,\alpha}\}$ to specify the projections on the remaining prior axes
$(\hat{e}_2, \hat{e}_3, \ldots, \hat{e}_{\alpha-1})$. Given $d_t\hat{y}_\alpha$ and (in exactly the same way) $d_t\hat{y}_\beta$, we get the
correlation as the dot product $\rho_{\alpha\beta} = d_t\hat{y}_\alpha \cdot d_t\hat{y}_\beta$.

The counting is that there are $N - 1$ polar angles and $(N-1)(N-2)/2$
azimuthal angles adding up to the total number of correlation matrix elements
$N(N-1)/2$.

The angles $\{\theta_{1\beta}\}, \{\varphi_{\alpha\beta}\}$ completely determine the correlations. Conversely,


the same argument as given above sequentially determines the angles given the 
correlations. The mapping is one to one. 
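The general construction can be sketched as follows: each variable's unit vector is built from its own list of angles (one polar angle followed by its azimuthal angles), and the correlation matrix is the matrix of dot products. The function names, the data layout `angle_rows`, and the sample angles are illustrative assumptions.

```python
import math

def unit_vector(angles, n):
    """Spherical co-ordinates -> unit vector in n dimensions.

    angles = [theta_1b, phi_2b, ..., phi_(b-1)b] for variable b = len(angles) + 1.
    """
    vec, sin_prod = [], 1.0
    for a in angles:
        vec.append(sin_prod * math.cos(a))   # projection on the next axis
        sin_prod *= math.sin(a)              # what remains for later axes
    vec.append(sin_prod)                     # remainder goes on the new axis
    return vec + [0.0] * (n - len(vec))

def corr_from_angles(angle_rows):
    """Correlation matrix as the Gram matrix of the unit vectors."""
    n = len(angle_rows) + 1
    ys = [unit_vector([], n)] + [unit_vector(row, n) for row in angle_rows]
    return [[sum(p * q for p, q in zip(u, v)) for v in ys] for u in ys]

# Hypothetical angles for N = 5 variables.
angle_rows = [[0.7], [1.1, 2.0], [0.9, 1.3, 0.6], [1.2, 0.4, 2.2, 1.0]]
rho = corr_from_angles(angle_rows)
n = len(rho)
n_polar = len(angle_rows)
n_azimuthal = sum(len(row) - 1 for row in angle_rows)
print(n_polar, n_azimuthal, n * (n - 1) // 2)
```

The counts printed at the end illustrate the counting argument: polar plus azimuthal angles equal the number of independent correlations.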


The B Variables and a Recursion Relation 


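We next give a useful recursion relation for $\gamma = 1, \ldots, \alpha - 2$ ; $2 < \alpha < \beta$. The
relation can be used to get the general expressions by giving a name for the
innermost bracket and proceeding outward by recursion. We write

$B_\gamma^{\alpha\beta} = \cos(\varphi_{\gamma+1,\alpha}) \cos(\varphi_{\gamma+1,\beta}) + \sin(\varphi_{\gamma+1,\alpha}) \sin(\varphi_{\gamma+1,\beta})\, B_{\gamma+1}^{\alpha\beta}$ (22.10)

The angles used in the spherical geometry are given by

$\cos(\theta_{1\beta}) = B_0^{1\beta} \quad (2 \le \beta)$
$\cos(\varphi_{\alpha\beta}) = B_{\alpha-1}^{\alpha\beta} \quad (2 \le \alpha < \beta)$ (22.11)

All the original correlations are given by

$\rho_{\alpha\beta} = B_0^{\alpha\beta} = \cos(\theta_{1\alpha}) \cos(\theta_{1\beta}) + \sin(\theta_{1\alpha}) \sin(\theta_{1\beta})\, B_1^{\alpha\beta}$ (22.12)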
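A recursion of this type is easy to program. The sketch below is one reading of the recursion, in which the innermost bracket for the pair $(\alpha, \beta)$ is $\cos\varphi_{\alpha\beta}$ and each outer bracket combines the azimuthal angles of the two variables on the next axis; the index conventions, function names, and sample angles are assumptions. It is checked against the explicit four-dimensional formula for $\rho_{34}$.

```python
import math

def b_var(gamma, alpha, beta, phi):
    """B variable at level gamma for the pair (alpha, beta).

    phi[(a, b)] is the azimuthal angle phi_ab with a < b.
    """
    if gamma == alpha - 1:                   # innermost bracket
        return math.cos(phi[(alpha, beta)])
    g = gamma + 1
    return (math.cos(phi[(g, alpha)]) * math.cos(phi[(g, beta)])
            + math.sin(phi[(g, alpha)]) * math.sin(phi[(g, beta)])
            * b_var(g, alpha, beta, phi))

def rho(alpha, beta, theta, phi):
    """Correlation for 1 <= alpha < beta; theta[b] is the polar angle theta_1b."""
    if alpha == 1:
        return math.cos(theta[beta])
    return (math.cos(theta[alpha]) * math.cos(theta[beta])
            + math.sin(theta[alpha]) * math.sin(theta[beta])
            * b_var(1, alpha, beta, phi))

# Hypothetical angles for four variables.
theta = {2: 0.7, 3: 1.1, 4: 0.9}
phi = {(2, 3): 2.0, (2, 4): 1.3, (3, 4): 0.6}
rho24 = rho(2, 4, theta, phi)
rho34 = rho(3, 4, theta, phi)
print(rho24, rho34)
```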
Spherical Representation of the Cholesky Decomposition


The above result is neatly summarized in the following observation. Recall that
the Cholesky decomposition of a positive definite correlation matrix is the square
root of the matrix. But we explicitly construct the correlation matrix by taking the
square: $\rho_{\alpha\beta} = d_t\hat{y}_\alpha \cdot d_t\hat{y}_\beta$. Therefore, we have generated the Cholesky
decomposition by construction. Write the two decompositions
$d_t\hat{y}_\alpha = \sum_\gamma (\hat{e}_\gamma \cdot d_t\hat{y}_\alpha)\,\hat{e}_\gamma$, $d_t\hat{y}_\beta = \sum_{\gamma'} (\hat{e}_{\gamma'} \cdot d_t\hat{y}_\beta)\,\hat{e}_{\gamma'}$, note the orthogonality
relation $\hat{e}_\gamma \cdot \hat{e}_{\gamma'} = \delta_{\gamma\gamma'}$, and identify terms in $\rho_{\alpha\beta} = \sum_\gamma \left( \rho^{1/2} Q^T \right)_{\alpha\gamma} \left( Q \rho^{1/2} \right)_{\gamma\beta}$.
Here, Q is the appropriate rotation to the co-ordinate system that we have
constructed by hand. After a little index manipulation we get the Cholesky matrix
elements $C_{\alpha\beta}$ in spherical co-ordinates, namely

$C_{\alpha\beta} = \hat{e}_\alpha \cdot d_t\hat{y}_\beta$ (22.13)

The entries of the $\alpha$-th row are the $\hat{e}_\alpha$ coefficients of all $\{d_t\hat{y}_\beta\}$, and the entries of
the $\beta$-th column are the coefficients of $d_t\hat{y}_\beta$ for all $\{\hat{e}_\alpha\}$. Explicitly in terms of
the angles, the Cholesky matrix C is


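$C = Q\rho^{1/2} = \begin{pmatrix}
1 & C_{\theta_{12}} & C_{\theta_{13}} & \cdots & C_{\theta_{1\beta}} \\
0 & S_{\theta_{12}} & S_{\theta_{13}} C_{\varphi_{23}} & \cdots & S_{\theta_{1\beta}} C_{\varphi_{2\beta}} \\
0 & 0 & S_{\theta_{13}} S_{\varphi_{23}} & \cdots & S_{\theta_{1\beta}} S_{\varphi_{2\beta}} C_{\varphi_{3\beta}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & S_{\theta_{1\beta}} \prod_{\alpha=2}^{\beta-1} S_{\varphi_{\alpha\beta}} \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}$ (22.14)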


Note that the equations for the $\{d_t\hat{y}_\beta\}$ earlier in this chapter can be read off
the columns. The determinant of this triangular matrix is just the product of the
diagonals, so the determinant of the full correlation matrix is its square,

$\det(\rho) = \prod_{\beta=2}^{N} \left[ S^2_{\theta_{1\beta}} \prod_{\alpha=2}^{\beta-1} S^2_{\varphi_{\alpha\beta}} \right]$ (22.15)

This determinant is of course positive. All principal sub-determinants are also
positive.


The rotation Q can be obtained given the SVD of the correlation
matrix (c.f. Ch. 24), using the diagonal matrix $\Lambda$ of eigenvalues and the matrix
U whose columns are the eigenfunctions of $\rho$, namely

$Q = C \rho^{-1/2} = C\, U \Lambda^{-1/2} U^T$ (22.16)


This completes the description of the geometry of the correlation matrix. 
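To illustrate the spherical reading of the Cholesky matrix, the sketch below computes an upper-triangular factor $C$ with $\rho = C^T C$, reads the angles off each column using the positive-sine branch, and checks that the determinant is the product of the squared diagonal entries. The function names and the 3x3 example matrix are hypothetical.

```python
import math

def cholesky_upper(rho):
    """Upper-triangular C with rho = C^T C; column b holds the coefficients of variable b."""
    n = len(rho)
    c = [[0.0] * n for _ in range(n)]
    for b in range(n):
        for a in range(b + 1):
            s = rho[a][b] - sum(c[k][a] * c[k][b] for k in range(a))
            if a == b:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                c[a][b] = math.sqrt(s)
            else:
                c[a][b] = s / c[a][a]
    return c

def angles_from_column(col):
    """Angles [theta_1b, phi_2b, ...] from the nonzero part of a Cholesky column."""
    angles, sin_prod = [], 1.0
    for v in col[:-1]:
        a = math.acos(max(-1.0, min(1.0, v / sin_prod)))
        angles.append(a)
        sin_prod *= math.sin(a)
    return angles

rho = [[1.0, 0.5, 0.2],
       [0.5, 1.0, 0.3],
       [0.2, 0.3, 1.0]]                      # hypothetical correlation matrix
c = cholesky_upper(rho)
angle_rows = [angles_from_column([c[k][b] for k in range(b + 1)])
              for b in range(1, len(rho))]
det = math.prod(c[b][b] ** 2 for b in range(len(rho)))
print(angle_rows, det)
```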


Notes for the N- Sphere 


The complexity of the geometrical approach of the correlations is not as severe as 
it might appear. There are a few technical subtleties¹², but using the recursion
relations, the programming is actually straightforward. Practically speaking, a 
correlation matrix with hundreds of variables can be decomposed on a single 
machine. 

We also note that we can find explicit representations for the derivatives of
the correlations $\rho_{\alpha\beta}$ with respect to the angles $\{\theta_{1\beta}\}, \{\varphi_{\alpha\beta}\}$. This will be useful
in the sequel when we construct a least-squares measure to obtain an optimal
correlation matrix $(\rho_{\alpha\beta})$ relative to a “target” correlation matrix.


Again, since the angles are all independent, they can be varied independently 
allowing a consistent representation of modifying positive-definite correlation 
matrices in general. 

Finally, note that what for convenience we have been calling a “point” on the
$\mathcal{N}$-sphere actually refers to the set $\{d_t\hat{y}_\beta\}$ of points on the $\mathcal{N}$-sphere¹³.


12 Sign of the Sine: There is a tricky point regarding the sign of the sine of an azimuthal
angle. We have to take the positive branch of the square root $+(1-\cos^2\varphi)^{1/2}$ and we cannot
write the expression $\sin[\cos^{-1}(\cos\varphi)]$. This requirement may be due to similar
conventions in canned numerical routines. Also if some $\sin\varphi < 0$, then other $\{\sin\varphi\}$
terms need to change sign also.


13 Centroid: The centroid or center-of-gravity of the points could be chosen as one point
to represent the system of points on the sphere, if desired. 
Simulation with Realistic Time Series and Arbitrary Correlations


Realistic MC simulations generating multiple time series that are consistent with 
a given correlation matrix can easily be done using the above formalism. This 
solves the inverse problem of generating time series that satisfy correlations 
specified in advance. Equations (22.5) give the first three equations. The
orthonormal unit vectors $\{\hat{e}_\alpha\}$ can be chosen arbitrarily.

One useful procedure is to take the first unit vector $\hat{e}_1(t)$ at each time as a
given time series, e.g. from data:

$d_t\hat{y}_1(t) = \hat{e}_1(t) = d_t x_1(t)$ (22.17)

The other unit vectors $\{\hat{e}_\alpha(t)\}_{\alpha > 1}$ can be Gaussian random variables. Then
the collection of the various series can mock up the evolution of a yield curve or
other data, while being able to stress the correlations up or down as needed.

A “sandbox” test lab can be set up in this way to test data hole-filling 
schemes using a technique called MSSA, as described in Ch. 36. 
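A minimal sketch of such a simulation is given below. It takes a given first series as $\hat{e}_1$, draws the remaining directions as Gaussian series, and mixes them with the Cholesky factor so that the sample correlations equal the target. Two choices here are assumptions made for the illustration and are not prescribed by the text: the Gaussian draws are explicitly orthonormalized by Gram-Schmidt so that the match is exact rather than approximate, and all names and numbers are hypothetical.

```python
import math
import random

def cholesky_upper(rho):
    """Upper-triangular C with rho = C^T C."""
    n = len(rho)
    c = [[0.0] * n for _ in range(n)]
    for b in range(n):
        for a in range(b + 1):
            s = rho[a][b] - sum(c[k][a] * c[k][b] for k in range(a))
            c[a][b] = math.sqrt(s) if a == b else s / c[a][a]
    return c

def center_and_normalize(v):
    """Subtract the average and scale to unit length."""
    mean = sum(v) / len(v)
    d = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in d))
    return [x / norm for x in d]

def orthonormal_basis(first_series, n, rng):
    """e_1 from the given series; e_2..e_n from Gaussian draws (Gram-Schmidt)."""
    basis = [center_and_normalize(first_series)]
    while len(basis) < n:
        v = center_and_normalize([rng.gauss(0.0, 1.0) for _ in first_series])
        for e in basis:
            proj = sum(a * b for a, b in zip(v, e))
            v = [a - proj * b for a, b in zip(v, e)]
        basis.append(center_and_normalize(v))
    return basis

def simulate(rho, first_series, seed=0):
    """Unit-vector time series whose sample correlation matrix equals rho."""
    n, rng = len(rho), random.Random(seed)
    c = cholesky_upper(rho)
    e = orthonormal_basis(first_series, n, rng)
    return [[sum(c[a][b] * e[a][t] for a in range(b + 1))
             for t in range(len(first_series))] for b in range(n)]

target = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
first = [0.3, -1.2, 0.8, 0.1, -0.4, 1.5, -0.9, 0.2, 0.6, -0.7]   # stand-in for data
series = simulate(target, first)
sample = [[sum(p * q for p, q in zip(u, v)) for v in series] for u in series]
```

Stressing the correlations up or down then amounts to changing `target` while keeping the same first series.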
23. Stressed Correlations and Random Matrices
(Tech. Index 5/10) 


In this chapter, we first consider various methods for dealing with a matrix of 
stressed correlations. We start with scenario analysis to define target stressed 
correlations, motivated by data (see also Ch. 37). We introduce the concept of the 
average correlation stress. Naturally, these are target correlation stresses for 
which the correlation matrix will not be positive definite. The technique of 
finding an optimal positive-definite approximation to a non-positive-definite 
target correlation matrix is treated in the next chapter. 

We then show how to generate random correlation matrices using two 
techniques. The first method is a direct application of historical data. Historical 
correlation matrices are in principle positive definite, and in practice are close to 
positive definite. The second method uses historical data to construct a model for 
stochastic random correlation matrices. The model contains Gaussian 
multivariate techniques on the space of angles of the $\mathcal{N}$-sphere that is
equivalent to the correlation matrices. This second method always yields a 
positive definite correlation matrix. 


Correlation Stress Scenarios Using Data 


In this section, we look at data for correlations to get an idea of the variability of
the correlations $(\rho_{\alpha\beta})$ in practice over many variables. We need this in order to
perform the stressed correlation matrix analysis. We need to get a target stressed
matrix $(\rho^{(\text{Target})}_{\alpha\beta})$ that has the property that individual matrix elements are
stressed using information from historical data. We will then find the best-fit
positive-definite correlation matrix to be used in the risk
analysis, such as the Stressed VAR, as described in Ch. 27. The amount of the
stress in the correlations for the resulting best-fit matrix will in general be less
than the amount of stress for $(\rho^{(\text{Target})}_{\alpha\beta})$. This is because the various
correlations are dependently constrained.
Therefore, the correlation matrix elements cannot consistently be stressed
independently.
As one example, we look at the maximum and minimum $\rho^{(\text{Max})}_{\alpha\beta}$, $\rho^{(\text{Min})}_{\alpha\beta}$
correlation of each pair of variables $\alpha, \beta$ over all windows of a given length
running over the time series¹. This is easy to accomplish. We now can get a scale
for the maximum fluctuation of a correlation using $\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$ where
$\Delta\rho^{(\text{Max,Min})}_{\alpha\beta} = \rho^{(\text{Max})}_{\alpha\beta} - \rho^{(\text{Min})}_{\alpha\beta}$.

We could stress the current correlations² to get the target stressed correlations
by assuming that the historically biggest change for each correlation is the stress
target change, i.e. $\rho^{(\text{Target})}_{\alpha\beta} = \rho^{(\text{Current})}_{\alpha\beta} \pm \Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$. The sign $\pm$ in this equation
is the same as the sign of $\rho^{(\text{Current})}_{\alpha\beta}$. That is, we stress positive current correlations
upward towards one and negative current correlations downwards toward minus
one. This makes sense, since in a stressed environment correlations between risk
factors tend to become more marked in a signed sense³.

We naturally need $\rho^{(\text{Target})}_{\alpha\beta}$ constrained in the range [-1, 1]. If $\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$ is
too big we cut off the stress and impose the [-1, 1] constraint by hand. This
should be understood in all the correlation stress equations.


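We can also define the average of the max and min correlations $\rho^{(\text{Avg})}_{\alpha\beta}$ as
$\rho^{(\text{Avg})}_{\alpha\beta} = \frac{1}{2}\left[ \rho^{(\text{Max})}_{\alpha\beta} + \rho^{(\text{Min})}_{\alpha\beta} \right]$. A less conservative choice for the target stress is to
use $\Delta\rho^{(\text{Max,Avg})}_{\alpha\beta} = \rho^{(\text{Max})}_{\alpha\beta} - \rho^{(\text{Avg})}_{\alpha\beta} = \frac{1}{2}\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$. From this viewpoint, we stress
the correlations by the biggest fluctuations about the historical average. These are
half as big as the maximum fluctuations. Using this assumption, we define

$\rho^{(\text{Target})}_{\alpha\beta} = \rho^{(\text{Current})}_{\alpha\beta} \pm \frac{1}{2}\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$ (23.1)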


1 Time Windows: The time length of the windows is determined by financial relevance.
See the discussion of correlations and data in Ch. 37. 


2 Current Correlations: These are the correlations measured over a standard window
ordinarily used for risk management. Our goal, again, is to take into account correlation 
variability risk. In order to do this, we need to stress the correlations. 


3 Correlation Stresses for Hedges, Pairs Trading: For hedged (long - short) positions
on underlyings $x_\alpha$ and $x_\beta$ with $\rho_{\alpha\beta} > 0$, the most loss would occur if $\rho_{\alpha\beta}$ decreased. For
pairs trading or convergence trades with $\rho_{\alpha\beta} < 0$, the most loss would occur if $\rho_{\alpha\beta}$
increased. These correlation moves are opposite to what is assumed in the text for most
variables. A more refined treatment of correlation risk would take the major correlation risks
physically into account.
We remind the reader that the stressed-correlation target matrix is a scenario,
and any measure provides a possible target. Since the best-fit positive-definite 
stress will be less than the target stress, it is perhaps better to adopt a conservative 
stress for the target stress. 
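The scenario construction can be sketched as follows: compute the correlation over every window of a given length, take the maximum and minimum, and move the current correlation in the direction of its sign, cutting off at [-1, 1]. The function names, the treatment of an exactly zero current correlation as positive, and the sample numbers are assumptions made for the illustration.

```python
def corr(a, b):
    """Sample correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def max_min_corr(dx_a, dx_b, window):
    """Maximum and minimum correlation over all windows of the given length."""
    corrs = [corr(dx_a[j:j + window], dx_b[j:j + window])
             for j in range(len(dx_a) - window + 1)]
    return max(corrs), min(corrs)

def target_stressed_corr(rho_current, delta_max_min, conservative=True):
    """Stress in the direction of the sign of the current correlation.

    conservative=True uses the full max-min change; False uses half of it,
    as in eq. (23.1). The result is cut off to lie in [-1, 1].
    """
    delta = delta_max_min if conservative else 0.5 * delta_max_min
    sign = 1.0 if rho_current >= 0.0 else -1.0   # zero treated as positive (assumption)
    return max(-1.0, min(1.0, rho_current + sign * delta))

# Hypothetical changes of two variables.
dx_a = [1.0, 2.0, 3.0, 2.0, 1.0, 3.0]
dx_b = [1.0, 2.0, 2.5, 1.0, 2.0, 1.0]
rho_max, rho_min = max_min_corr(dx_a, dx_b, window=3)
print(rho_max, rho_min, target_stressed_corr(0.2, rho_max - rho_min, conservative=False))
```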


The Average-Target-Correlation Stress 


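It is simpler to adopt a procedure where the correlations are stressed by a
fixed average amount $\Delta\rho^{(\text{Target})}_{\text{Avg}}$ independent of $\alpha, \beta$. This has the disadvantage
that the average stress is less accurate than detailed stresses. However, the
procedure has the distinct advantage that it is easier to implement and easier to
explain to management. At the end of this section, we will refine this average in
different ways. We write, again with the $\pm$ being the same as the sign of
$\rho^{(\text{Current})}_{\alpha\beta}$, namely $\rho^{(\text{Target})}_{\alpha\beta} = \rho^{(\text{Current})}_{\alpha\beta} \pm \Delta\rho^{(\text{Target})}_{\text{Avg}}$.

In order to get $\Delta\rho^{(\text{Target})}_{\text{Avg}}$, we can plot the distribution in $\alpha, \beta$ of $\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$
and take half the average value over $\{\alpha, \beta\}$, viz $\Delta\rho^{(\text{Target})}_{\text{Avg}} = \frac{1}{2}\left\langle \Delta\rho^{(\text{Max,Min})}_{\alpha\beta} \right\rangle$, up
to some error that could be taken as the width of the $\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$ distribution.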


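Exposure-Weighted Definition: I


It is more logical and relevant for the averaged stressed correlation to receive
contributions from “more important” variables. The importance can be quantified
by the exposures⁴ $\{\mathcal{E}_\alpha\}$. Therefore, a better definition for $\Delta\rho^{(\text{Target})}_{\text{Avg}}$ would be
the exposure-weighted sum, namely

$\Delta\rho^{(\text{Target})}_{\text{Avg}} = \dfrac{1}{\Gamma} \sum_{\alpha \ne \beta} \left| \mathcal{E}_\alpha \mathcal{E}_\beta \right| \Delta\rho_{\alpha\beta}$ (23.2)

Here $\Delta\rho_{\alpha\beta}$ is the measure chosen to represent the correlation uncertainty and the
normalization is $\Gamma = \sum_{\alpha \ne \beta} \left| \mathcal{E}_\alpha \mathcal{E}_\beta \right|$.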


4 Exposures: Exposures are measures related to the positions, e.g. DV01. They are
described in Ch. 26.
Exposure-Weighted Definition: II
It is clear that if the product of the exposures is positive, $\mathcal{E}_\alpha \mathcal{E}_\beta > 0$ (two long
exposures), then a positive correlation change $\Delta\rho_{\alpha\beta} > 0$ will increase risk.
However if $\mathcal{E}_\alpha \mathcal{E}_\beta < 0$ (a long and a short exposure) then a negative correlation
change $\Delta\rho_{\alpha\beta} < 0$ will increase risk.


For sell-side applications, this does not matter much because most desks are 
just long the market, and it turns out that the sell-side market correlations tend to 
be positive. 

However for buy-side applications, the factor correlations are roughly 
equally positive and negative. Therefore to be conservative, the product of the 
exposures can be used as a benchmark to determine the signs of the correlation 
changes. Note that this means that the stressed correlations depend on the portfolio that
determines the exposures. 
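The two exposure-based ideas can be sketched together. The weighting by the absolute value of the product of exposures is an assumption of this sketch (it keeps the weights positive), as are the function names and the sample numbers.

```python
def exposure_weighted_stress(delta_rho, exposures):
    """Average correlation stress weighted by |E_a * E_b| over all pairs a != b."""
    n = len(exposures)
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    gamma = sum(abs(exposures[a] * exposures[b]) for a, b in pairs)   # normalization
    total = sum(abs(exposures[a] * exposures[b]) * delta_rho[a][b] for a, b in pairs)
    return total / gamma

def stress_sign(exposure_a, exposure_b):
    """Sign of the correlation change that increases risk for this pair."""
    return 1.0 if exposure_a * exposure_b > 0 else -1.0

# Hypothetical correlation uncertainties and exposures.
delta_rho = [[0.0, 0.2, 0.4],
             [0.2, 0.0, 0.6],
             [0.4, 0.6, 0.0]]
equal_weight = exposure_weighted_stress(delta_rho, [1.0, 1.0, 1.0])
concentrated = exposure_weighted_stress(delta_rho, [1.0, -2.0, 0.0])
print(equal_weight, concentrated, stress_sign(1.0, -2.0))
```

With equal exposures the result reduces to the plain average; with the exposure concentrated in two variables, only their pair contributes.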


Subset Definitions, Arbitrary Weights 


More refined definitions involve summing over a subset of the possible
correlation pairs. This can be done to emphasize the correlation instabilities of a
particular subset S of variables, with sums running over $\{\alpha, \beta \in S\}$. For
example, S could be chosen as all those correlations with correlation values
above a certain threshold, or as correlations inside a certain sector (e.g. fixed
income, equities, commodities, emerging markets, etc.). We can also include a
weight $w_{\alpha\beta}$, for example using the exposures. The result for $\Delta\rho^{(\text{Target})}_{\text{Avg}}$ is
dependent on the subset of variables S and the set of weights w. Setting
$\Gamma = \sum_{\alpha, \beta \in S} w_{\alpha\beta}$, we have

$\Delta\rho^{(\text{Target})}_{\text{Avg}}(w, S) = \dfrac{1}{\Gamma} \sum_{\alpha, \beta \in S} w_{\alpha\beta}\, \Delta\rho_{\alpha\beta}$ (23.3)
a, pes 


Applying the Subset Average-Target Correlation Stress 

An obvious choice for the application of the subset average target stress can be
for only those $\{\alpha, \beta \in S\}$, leaving the other correlations fixed. However, the
stress can in fact be applied to all correlations regardless of whether they are in


the subset S or not. This is consistent and it may be preferable. There is a good 
reason for this point of view. 
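Consider the result obtained using a subset S of relatively volatile
correlations. This means that $\Delta\rho^{(\text{Target})}_{\text{Avg}}(S)$ will be larger than the target stress
$\Delta\rho^{(\text{Target})}_{\text{Avg}}(\text{All})$ for all variables. Now consider the final average stress $\Delta\rho^{(\text{Final})}_{\text{Avg}}$
for the final matrix after the least-squares optimization, subtracting the current
correlation matrix $(\rho^{(\text{Current})}_{\alpha\beta})$,

$\Delta\rho^{(\text{Final})}_{\text{Avg}} = \dfrac{1}{\Gamma} \sum_{\alpha < \beta} \left| \rho^{(\text{Final})}_{\alpha\beta} - \rho^{(\text{Current})}_{\alpha\beta} \right|$ (23.4)

Here, $\Gamma = N(N-1)/2$. The final stress is generally less than the target stress, so
$\Delta\rho^{(\text{Final})}_{\text{Avg}} < \Delta\rho^{(\text{Target})}_{\text{Avg}}(S)$. Since also $\Delta\rho^{(\text{Target})}_{\text{Avg}}(\text{All}) < \Delta\rho^{(\text{Target})}_{\text{Avg}}(S)$, it can
happen that $\Delta\rho^{(\text{Final})}_{\text{Avg}}$ may be approximately equal to $\Delta\rho^{(\text{Target})}_{\text{Avg}}(\text{All})$. That is,
aiming at a high target correlation stress can lead to consistency with the average
correlation stress once the positive-definite constraints are imposed⁵.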


Pedagogical Example for the Average-Target Correlation Stress 


The first drawing is an illustration of a possible distribution among the different
$(\alpha, \beta)$ for the maximum correlation $\rho^{(\text{Max})}_{\alpha\beta}$, defined with respect to the various
choices of the starting times of the fixed-length windows in the data series⁶. The
maximum correlation $\rho^{(\text{Max})}_{\alpha\beta}$ = 0.1 to 0.2 is the most probable range for this
illustration. Significant high-correlation tail effects are present all the way out to
$\rho^{(\text{Max})}_{\alpha\beta} = 1$. A few events with $\rho^{(\text{Max})}_{\alpha\beta} < 0$ can exist⁷.


5 Remark: This situation has occurred in practice.


6 Acknowledgements: The figures are intended only to provide a pedagogical example.
They are based on data for $N = 100$ risk factors in 1999-2000. We thank Citigroup for the
use of the statistics, run by M. Rodriguez.


"Correlation Sign Warning: The signs of some correlations are arbitrary. A correlation 
involving an FX rate will change sign depending on the quoting convention, for example 
$\eta$ = 100 Yen/USD or $\xi$ = 0.01 USD/Yen, because $d\eta/\eta = -d\xi/\xi$. This makes sense
because these definitions correspond to mirror statements about the currencies. See the 
chapter on FX options (Ch. 5). 
[Figure: Frequency of Maximum Correlation Values]

An illustrative distribution of minimum correlations $\rho^{(\text{Min})}_{\alpha\beta}$ defined with
respect to the various choices of the starting times of the fixed-length windows,
again for the different choices $(\alpha, \beta)$, is in the next drawing. For the minimum,
$\rho^{(\text{Min})}_{\alpha\beta}$ = -0.1 to -0.2 is the most probable range. Tail effects are present, though
not to the extent of $\rho^{(\text{Max})}_{\alpha\beta}$.

[Figure: Frequency of Minimum Correlation Values]
The target stress for the correlations can be taken from the max, min
correlation difference $\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$. An illustration of this distribution is in the
drawing below.

[Figure: Frequency of Max-Min Correlation Values. Horizontal axis: $\rho_{max} - \rho_{min}$.]

The most probable range for Max-Min correlation difference $\Delta\rho^{(\text{Max,Min})}_{\alpha\beta}$ is
0.2 to 0.3, and again tail effects are present.
In practice, it turns out that if the target stress of 0.3 is used for all correlations,
then the final correlation stress, after obtaining a positive-definite matrix, is about
half this number, $\Delta\rho^{(\text{Final})}_{\text{Avg}} \approx 0.15$. It also happens that if we were to use a subset
S containing rather volatile correlation fluctuations, say $\left\langle \Delta\rho^{(\text{Max,Min})}_{\alpha\beta}(S) \right\rangle = 0.6$,
along with the fluctuations around the average, then we would get
$\Delta\rho^{(\text{Target})}_{\text{Avg}}(S) = 0.30$. Using the volatile correlation subset S as a target, the
final average correlation stress is close to the target average stress, around the
average for all variables.


Stressed Random Correlation Matrices 


We have seen that a variety of correlation stress scenarios can be adopted. A 
statistical treatment of stochastic correlations generalizing the scenario 
assumptions can be carried out in principle. There are at least two ways that 
stochastic correlation matrices can be generated. The first is a direct historical 
approach, and the second uses the geometrical approach, as presented in the
previous chapter⁸.


Random Correlation Matrices Using Historical Data 
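For a direct historical determination of random correlation matrices, we first need
to adopt a time window of length $N_w$, say 3 months or 65 business days. As
usual, we often want variables, e.g. $x_\alpha = \ln r_\alpha$, to deal with returns. Then for each
window starting date $t_j$ and for variables $x_\alpha, x_\beta$ the windowed correlation is

$\rho_{\alpha\beta}(t_j) = \dfrac{1}{N_w\, \sigma_\alpha(t_j)\, \sigma_\beta(t_j)} \sum_{l=j}^{j+N_w-1} \left[ d_t x_\alpha(t_l) - \langle d_t x_\alpha \rangle_j \right] \left[ d_t x_\beta(t_l) - \langle d_t x_\beta \rangle_j \right]$ (23.5)

Here $d_t x_\alpha(t_l) = x_\alpha(t_{l+1}) - x_\alpha(t_l)$, $\langle d_t x_\alpha \rangle_j = \frac{1}{N_w} \sum_{l=j}^{j+N_w-1} d_t x_\alpha(t_l)$, and the
variance at $t_j$ is $\sigma_\alpha^2(t_j) = \frac{1}{N_w} \sum_{l=j}^{j+N_w-1} \left[ d_t x_\alpha(t_l) - \langle d_t x_\alpha \rangle_j \right]^2$.
The historical correlation matrix $(\rho_{\alpha\beta}(t_j))$ for all $\{\alpha, \beta\}$ using the above
prescription is not guaranteed to be positive definite, but it should be close. The
usual scheme to be described in Ch. 24 (SVD coupled with throwing out the
negative or zero eigenvalues coupled with renormalizing the eigenfunctions), but
without least-squares optimization, is good enough for our purposes. Note that
we need to go through this procedure for each time $t_j$.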

The resulting positive-definite historical correlation matrices are then used to
get random correlation matrices simply by random selection. Take a set of
random numbers $\{R_\#\}$ uniformly distributed over the unit interval (0,1). Also
say we have $N_{data}$ historical correlation matrices at $t_j$, $j = 1, \ldots, N_{data}$. Then we
bin (0,1) into $N_{data}$ bins. Throwing the dice, we get the matrix from a particular


8 Third Method for Stressed Correlations Using Principal-Components: There is,
theoretically, a third method for getting random correlations using principal components. 
The idea is to generate changes in the eigenvalues, along with rotations of the 
eigenfunctions. However, this method is far removed from the changes in the physical 
correlations, because many variables contribute to each principal component. For this 
reason, we prefer the methods in the text dealing directly with correlations. 
time $t_j$ with probability $1/N_{data}$. For a particular random number $R_\#$ and its
associated time $t_j$ we get

$(\rho_{\alpha\beta}(R_\#)) = (\rho_{\alpha\beta}(t_j))$ (23.6)

These random matrices $(\rho_{\alpha\beta}(R_\#))$ can then be used as the stochastic


correlation matrices in a simulation, for example a stressed VAR simulation 
including stochastic correlations. 
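The historical procedure can be sketched as follows: compute the windowed correlation matrix for each starting date, then pick one matrix per uniform random number by binning the unit interval. This sketch omits the positive-definite clean-up step of Ch. 24, and the function names, window length, and synthetic data are assumptions made for the illustration.

```python
import random

def corr(a, b):
    """Sample correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def windowed_corr_matrix(dx, j, n_w):
    """Correlation matrix over the window of n_w changes starting at index j."""
    win = [series[j:j + n_w] for series in dx]
    return [[corr(u, v) for v in win] for u in win]

def random_historical_matrix(matrices, rng):
    """Pick one historical matrix: bin (0,1) into len(matrices) equal bins."""
    r = rng.random()                                       # uniform random number
    j = min(int(r * len(matrices)), len(matrices) - 1)     # bin index
    return j, matrices[j]

rng = random.Random(42)
# Synthetic stand-in for the changes of three variables over 40 dates.
dx = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(3)]
n_w = 10
matrices = [windowed_corr_matrix(dx, j, n_w) for j in range(len(dx[0]) - n_w + 1)]
j, chosen = random_historical_matrix(matrices, rng)
print(len(matrices), j)
```

Each matrix is selected with probability one over the number of available matrices; overlapping windows make the selected matrices dependent, as noted in the text that follows.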


Advantages and Disadvantages of Random Historical Matrices 


The advantage of this method is simplicity. There are some drawbacks. First, the 
length of the data time series limits the number of obtainable correlation 
matrices. Further, if the windows overlap in time, the correlation matrices 
obtained, while random, are dependent. Finally, the extremes in correlations that 
could potentially exist and for which the risk could be very large, may not be 
present (and probably are not present) in the data available or used for the 
analysis. 

We turn next to a possibly more powerful method of obtaining random 
correlation matrices that avoids these difficulties. 


Stochastic Correlation Matrices Using the $\mathcal{N}$-sphere


The idea here is that, starting with some base correlation matrix $(\rho^{(\text{Base})}_{\alpha\beta})$, we
generate randomly fluctuating new matrices $(\rho^{(\text{New})}_{\alpha\beta})$ from this base matrix. The
stochastic model for random correlation matrices uses the $\mathcal{N}$-sphere
representation of correlation matrices described in the previous chapter. This
technique will guarantee that positive definiteness is respected.


Review of the $\mathcal{N}$-Sphere and the Variables

Given N variables with correlation matrix $(\rho_{\alpha\beta})$, there is an associated $\mathcal{N}$-sphere
with $\mathcal{N} = N - 1$, embedded in N-dimensional Euclidean space. We call
the polar and azimuthal angles describing points on this sphere $\{\theta_{1\beta}\}, \{\varphi_{\alpha\beta}\}$.
For this discussion, the reader does not need the details.




A one-to-one mapping can be explicitly constructed between a positive- 
definite correlation matrix and these angles. This mapping is just the Cholesky 
decomposition in spherical co-ordinates. Moreover, the mapping can be inverted. 

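Thus, we have for each matrix element the representation in terms of the
angles, namely $\rho_{\alpha\beta} = \rho_{\alpha\beta}\left[ \{\theta_{1\beta}\}, \{\varphi_{\alpha\beta}\} \right]$. We also have the inverse
expression for each angle $\theta_{1\beta} = \cos^{-1}(\rho_{1\beta})$ and $\varphi_{\alpha\beta} = \varphi_{\alpha\beta}\left[ \{\rho_{\alpha\beta}\} \right]$ in terms
of $\{\rho_{\alpha\beta}\}$.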


The Angle Volatilities and the Angle-Angle Correlations 


In order to proceed, we need the "angle volatilities" for the polar angles $\sigma(\theta_{1\beta})$
and the azimuthal angles $\sigma(\varphi_{\alpha\beta})$. We also need the “angle-angle correlations”
between polar angles $\rho(\theta_{1\beta}, \theta_{1\beta'})$, between azimuthal angles $\rho(\varphi_{\alpha\beta}, \varphi_{\alpha'\beta'})$,
and between polar-azimuthal angles $\rho(\theta_{1\beta}, \varphi_{\alpha'\beta'})$.

The angle volatilities and angle-angle correlations can be obtained either
using a scenario assumption or using historical data. The scenario assumption just
means that numerical values for these quantities are assigned directly. The
historical data approach would use input from the historical correlation matrices
generated using the procedure outlined above, but for non-overlapping time
windows⁹.


The Random Correlation Matrices 
Now the idea of generating the stochastic matrix elements is quite 
straightforward. We use a multivariate Gaussian model to generate the random 
angles for points on the $\mathcal{N}$-sphere. We then construct the random correlation
matrix elements.

Specifically, given the angle volatilities and angle-angle correlations we use a
Gaussian multivariate model to generate random points $P^{(R_\#)}$ on the $\mathcal{N}$-sphere.
Each point is defined with a given set $\{R_\#\}$ of random numbers, viz


9 Non-Overlapping Time Windows, Angle Volatilities, and Correlations: If we are
picking the correlation matrix with the biggest effect, we want overlapping time windows
to increase the phase space of available matrices. Here the windows should be
non-overlapping. Otherwise, the angle volatilities will be too small. Correlations arising from
overlapping windows are close to each other only because they use some of the same 
information. The multivariate model can generate arbitrarily many random matrices. 
$P^{(R_\#)} = \left\{ \theta^{(R_\#)}_{1\beta}, \varphi^{(R_\#)}_{\alpha\beta} \right\}$. We then reconstruct the random correlation
matrix for a given set $\{R_\#\}$, viz $\rho^{(R_\#)}_{\alpha\beta} = \rho_{\alpha\beta}\left[ P^{(R_\#)} \right]$.


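Below is a diagram showing how random correlation matrices can be
generated using the $\mathcal{N}$-sphere:


Random Correlation Matrices Using the $\mathcal{N}$-sphere

[Starting point on the $\mathcal{N}$-sphere with angles $\{\theta^{(\text{Base})}_{1\beta}, \varphi^{(\text{Base})}_{\alpha\beta}\}$ corresponding to the base correlation matrix $(\rho^{(\text{Base})}_{\alpha\beta})$]
    ↓
[Random moves on the $\mathcal{N}$-sphere using multivariate angle random model]
    ↓
[New points on the $\mathcal{N}$-sphere with angles $\{\theta^{(\text{New})}_{1\beta}, \varphi^{(\text{New})}_{\alpha\beta}\}$ corresponding to new correlation matrices $(\rho^{(\text{New})}_{\alpha\beta})$]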


Note that we are not constructing a Gaussian multivariate model for the 
stochastic correlations matrix elements, but rather for the angles. This guarantees 
that every matrix is positive definite, because every point on the $\mathcal{N}$-sphere
given by the angles corresponds to a positive definite matrix¹⁰. Had we tried to
model the correlation matrix elements directly, we would get matrices far from
positive definite that would require considerable computer processing.
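A minimal sketch of the angle-based approach follows: map the base matrix to angles through its Cholesky factor, shock the angles with Gaussian noise, and rebuild the matrix as dot products of unit vectors. For simplicity the sketch uses one common angle volatility and independent shocks, i.e. it ignores the angle-angle correlations; that simplification, the function names, and the numbers are assumptions of the illustration.

```python
import math
import random

def cholesky_upper(rho):
    """Upper-triangular C with rho = C^T C; raises if rho is not positive definite."""
    n = len(rho)
    c = [[0.0] * n for _ in range(n)]
    for b in range(n):
        for a in range(b + 1):
            s = rho[a][b] - sum(c[k][a] * c[k][b] for k in range(a))
            if a == b:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                c[a][b] = math.sqrt(s)
            else:
                c[a][b] = s / c[a][a]
    return c

def angles_from_corr(rho):
    """Rows of angles [theta_1b, phi_2b, ...], one row per variable b >= 2."""
    c = cholesky_upper(rho)
    rows = []
    for b in range(1, len(rho)):
        angles, sin_prod = [], 1.0
        for k in range(b):
            a = math.acos(max(-1.0, min(1.0, c[k][b] / sin_prod)))
            angles.append(a)
            sin_prod *= math.sin(a)
        rows.append(angles)
    return rows

def corr_from_angles(rows):
    """Correlation matrix as dot products of the unit vectors built from the angles."""
    n = len(rows) + 1
    ys = [[1.0] + [0.0] * (n - 1)]
    for row in rows:
        vec, sin_prod = [], 1.0
        for a in row:
            vec.append(sin_prod * math.cos(a))
            sin_prod *= math.sin(a)
        vec.append(sin_prod)
        ys.append(vec + [0.0] * (n - len(vec)))
    return [[sum(p * q for p, q in zip(u, v)) for v in ys] for u in ys]

def random_corr(base, angle_vol, rng):
    """Shock every angle independently (simplifying assumption) and rebuild."""
    rows = angles_from_corr(base)
    shocked = [[a + rng.gauss(0.0, angle_vol) for a in row] for row in rows]
    return corr_from_angles(shocked)

base = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
new = random_corr(base, 0.1, random.Random(1))
unchanged = random_corr(base, 0.0, random.Random(1))
```

Because the new matrix is built from unit vectors, it has unit diagonal and no negative eigenvalues whatever the size of the shocks.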


10 A Point on the Sphere is a Bundle of Points: As mentioned in Ch. 22, please note that
what for simplicity we call a “point” on the $\mathcal{N}$-sphere is actually a bundle of points.


24. Optimally Stressed PD Correlation Matrices 
(Tech. Index 7/10) 


In this chapter, we deal with the problem of finding an optimal positive-definite
(PD) approximation for a given correlation matrix. Consider a non-positive-definite
(NPD) correlation matrix $(\rho)$. We call NPD matrices "illegal" and
positive-definite matrices "legal". Correlation matrices that are NPD can arise
from various sources. As discussed before, stressed correlation matrices are
desirable to probe correlation risk. Such stressed matrices are produced by
moving the individual correlation matrix elements $\rho_{\alpha\beta}$ from their current values
$\rho_{\alpha\beta}^{Current}$ by amounts $\Delta\rho_{\alpha\beta}$.

We did present a way in the last chapter for the $\Delta\rho_{\alpha\beta}$ to be chosen while
preserving PD constraints by using the $\mathcal{N}$-sphere geometrical formalism.
However, as also described in the last chapter, we might want to move the $\Delta\rho_{\alpha\beta}$
by hand using (e.g.) historical volatilities of correlations, maximum correlation
moves, etc. In that case, the constraints among the various correlations make it
difficult (read impossible) to preserve the PD constraint. The stressed matrix is
therefore most probably NPD.

One good reason to stress the correlations is in order to use them in Monte-Carlo
simulations generating stressed-correlated movements of underlying
variables¹. However, if a correlation matrix is NPD, it is useless for simulation.
This is because an NPD matrix has negative eigenvalues and so the real square
root matrix needed for simulations does not exist². Hence, a method is needed to
render NPD matrices positive definite in such a way that the stressed character of
the correlations is preserved as much as possible.

Using the $\mathcal{N}$-sphere geometrical formalism, an optimal technique is
presented here for finding a legal PD matrix that is the "best approximation" to
an illegal NPD target matrix. The NPD matrix is called the “target” because in
fact we may want to consider correlations stressed in a certain way for physical
reasons (the target), and we want to get as close to this target condition as
possible, maintaining the condition of PD legality.


1 Stressed VAR Simulation with Stressed Correlations: We will examine the use of
stressed correlations in simulations when we consider the Stressed VAR in Ch. 27.


2 NPD Matrix Misuse: Of course, a non-positive-definite matrix could be used, e.g., in a
quadratic-form VAR (cf. Ch. 26). However, such a procedure would be logically
inconsistent. If ad-hoc correlation matrices in quadratic-form expressions are used to
aggregate risk, they should be checked for being positive definite.

The idea is to use a two-step procedure³:


1. Get a starting point for the matrix, $(\rho)^{PD;Start}$, that is positive definite using an
algorithm involving the singular value decomposition (SVD), plus an eigenvalue
and eigenfunction renormalization, as described below.


2. Perform a least-squares fit moving the matrix elements, always maintaining
positive definiteness, until we get close to the NPD target matrix $(\rho)^{NPD;Target}$.


The procedure can be illustrated by the following flow chart:


Best-Fit PD Matrix to Stressed NPD Target Matrix

[Flow chart: Target stressed matrix $(\rho)^{Target}$ (illegal, NPD) -> starting matrix
$(\rho)^{PD;Start}$, made PD via SVD plus renormalization -> best-fit matrix $(\rho)^{BestFit}$,
obtained using least-squares optimization.]


The technique used to obtain the best-fit approximation to the target
correlation matrix proceeds using the geometrical spherical representation
described in Ch. 22. For the present purposes, it suffices to know that any PD
correlation matrix between N variables corresponds to a set of $N(N-1)/2$
angles $\{\theta_{\alpha\beta}\}$ on an $\mathcal{N} = N - 1$ sphere in N dimensions.


3 History: I developed the theory for this two-step procedure for optimally stressed
positive-definite correlation matrices in 1999-2000. It was implemented numerically by
Mark Rodriguez and Juan Castresana.


Least-Squares Fitting for the Optimal PD Stressed Matrix


The least-squares approach is used to provide a measure for minimizing the
difference between the target and best-fit matrix as the angles are varied around
the sphere. This procedure always produces a PD matrix, because any correlation
matrix parameterized by the $\mathcal{N}$-sphere angles is positive definite, as we
discussed in Ch. 22. The picture gives the idea.


Best-Fit Positive-Definite Matrix on the $\mathcal{N}$-Sphere:
Least-Squares Fitting to an NPD Target Matrix

[Figure: The illegal NPD target $(\rho)^{NPD;Target}$ is a point off the $\mathcal{N}$-sphere. Starting
from the SVD point, with the angles for the starting PD matrix $(\rho)^{PD;Start}$, we move on
the $\mathcal{N}$-sphere to get closer to the target via least-squares fitting. The best PD matrix
$(\rho)^{BestFit}$, with optimal best-fit angles, lies at the minimum least-squares distance
between the NPD target and the PD best fit.]


We write the chi-square as a function of the various angles to be moved
around on the sphere, as follows:

$\chi^2[\{\theta_{\alpha\beta}\}] = \sum_{\alpha,\beta=1}^{N} w_{\alpha\beta} \left( \rho_{\alpha\beta}^{NPD;Target} - \rho_{\alpha\beta}[\{\theta_{\alpha\beta}\}] \right)^2$    (24.1)

We move the angles $\{\theta_{\alpha\beta}\}$, beginning at the starting point $\{\theta_{\alpha\beta}^{Start}\}$,
using a least-squares routine until $\chi^2$ is as small as is practical, winding up with
the best-fit angles $\{\theta_{\alpha\beta}^{BestFit}\}$.


Weights in the Least-Squares Fitting 
The weights $\{w_{\alpha\beta}\}$ can be chosen to emphasize those correlations that the fit will
try to best reproduce. It is desirable to choose these correlations where the most
risk is perceived. An example is to take the weights proportional to the absolute
value of the product of the exposures of the underlying variables, or the product
of the fat-tail volatilities⁴ and the exposures. The variables that do not really
contribute much to the risk will have small weights, and the fit will not bother to
get these corresponding irrelevant correlations close to the target. Therefore, we
can choose either one of the two following expressions, where $\mathcal{E}_\alpha$ is the exposure
to variable $\alpha$ and $\sigma_\alpha$ is its fat-tail volatility:

$w_{\alpha\beta} = |\mathcal{E}_\alpha \mathcal{E}_\beta|$    (24.2)

$w_{\alpha\beta} = |\sigma_\alpha \sigma_\beta \mathcal{E}_\alpha \mathcal{E}_\beta|$    (24.3)
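
The weighted fit can be sketched on a small example. This is an illustration under stated assumptions, not the production implementation: the angle parameterization is the standard hyperspherical one (the conventions of Ch. 22 may differ), the optimizer is a plain gradient descent with step halving, the start is the identity matrix rather than the SVD starting point, and the target matrix and exposures are made-up numbers. All function names are hypothetical.

```python
import numpy as np

def corr_from_angles(theta):
    """Correlation matrix rho = B B^T; row i of B is a unit vector in angle form."""
    n = theta.shape[0]
    b = np.zeros((n, n))
    for i in range(n):
        sin_prod = 1.0
        for j in range(i):
            b[i, j] = np.cos(theta[i, j]) * sin_prod
            sin_prod *= np.sin(theta[i, j])
        b[i, i] = sin_prod
    return b @ b.T

def chi_square(x, target, weights, idx):
    """Weighted sum of squared differences, in the spirit of Eq. (24.1)."""
    theta = np.zeros_like(target)
    theta[idx] = x
    return float(np.sum(weights * (target - corr_from_angles(theta)) ** 2))

def fit_on_sphere(x0, target, weights, idx, n_iter=300, h=1e-6):
    """Gradient descent with a finite-difference gradient and step halving."""
    x, step = x0.copy(), 0.1
    f = chi_square(x, target, weights, idx)
    for _ in range(n_iter):
        grad = np.zeros_like(x)
        for k in range(x.size):
            xp = x.copy()
            xp[k] += h
            grad[k] = (chi_square(xp, target, weights, idx) - f) / h
        while step > 1e-12:
            trial = x - step * grad
            f_trial = chi_square(trial, target, weights, idx)
            if f_trial < f:            # accept only moves that lower chi-square
                x, f, step = trial, f_trial, 2.0 * step
                break
            step *= 0.5
        else:
            break                      # no improving step found: stop
    return x, f

# Hypothetical NPD target (its determinant is negative) and exposures.
target = np.array([[1.0, 0.9, -0.9],
                   [0.9, 1.0, 0.9],
                   [-0.9, 0.9, 1.0]])
exposures = np.array([1.0, 2.0, 0.5])
weights = np.abs(np.outer(exposures, exposures))   # Eq. (24.2)

idx = np.tril_indices(3, k=-1)
x0 = np.full(3, np.pi / 2)                         # identity matrix as PD start
chi_start = chi_square(x0, target, weights, idx)
x_fit, chi_fit = fit_on_sphere(x0, target, weights, idx)

theta_fit = np.zeros((3, 3))
theta_fit[idx] = x_fit
rho_fit = corr_from_angles(theta_fit)
```

Because every trial point is built from angles, the fit never leaves the set of legal matrices; the weights only decide which target correlations the fit tries hardest to match.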


Numerical Considerations for Optimal PD Stressed Matrix 


Practically speaking, the number of variables N that can be dealt with is a few
hundred, up to around a thousand at the maximum. The least-squares search for a
few hundred variables can be run overnight.
Typically, the convergence is rapid for a beginning period and then bogs down⁵.


4 Fat-Tail Vols: These result from Gaussian fits to outlier fat tails. Fat-tail volatility is
described in detail in Ch. 21.


5 Parallelization: The computer code can be parallelized, which would improve
convergence.


Example of Optimal PD Fit to an NPD Stressed Matrix


Here is a pedagogical example. The details are unimportant, but the results are
representative. The first figure⁶ ⁷ gives an illustration of the histogram frequency
of the values of the best-fit stressed correlations (dotted line) vs. unstressed
correlations (solid line). Consistent with the requirement of positive definiteness,
correlations can undergo substantial stress.


[Figure: Unstressed vs. Best Fit Stressed. Histogram of the unstressed correlations
(solid line) and the best-fit stressed correlations (dotted line).]


The stressed target matrix can be obtained by a scenario. As discussed in Ch.
23, one scenario has all positive (negative) correlations increased (decreased) by
some $\pm\Delta\rho_{Avg}^{(Target)}$, up to the limits [-1,1]. This represents a breakdown in the
markets envisioned where correlation increases due to flight-to-quality effects,
etc. There is a “hole” in this NPD target by construction, where no correlations
exist for $|\rho_{\alpha\beta}| < \Delta\rho_{Avg}^{(Target)}$. The target stressed matrix is generally not
positive definite, containing negative eigenvalues.

The next drawing illustrates the matrix elements of the NPD target matrix
$(\rho)^{Target}$ (solid line) and of the best-fit matrix $(\rho)^{BestFit}$ (dotted line) for
the case $\Delta\rho_{Avg}^{(Target)} = 0.3$. This amount of correlation stress is representative of
changes in volatile correlations. The stressed correlations from the NPD matrix
are reasonably reproduced by the best-fit correlations. The hole at small
correlations in the NPD target is filled in by the best fit.


[Figure: Stressed Non-Positive-Def. Target vs. Positive Def. Best Fit. NPD target
(solid line) and best-fit stressed correlations (dotted line).]


6 Data, Acknowledgements: The illustrative figures are based on data from 20 time
series during 1999-2000, for which I thank Citigroup. The least-squares optimization was
implemented by M. Rodriguez.


7 Other Data: A more typical histogram of correlations for many variables (rates, FX,
commodities, equities, etc.) is peaked around 0.1 with a width of around ±0.2. However,
“buy-side” factor model correlations have a different character. The histogram of factor
model correlations tends to be peaked around zero.


The last figure illustrates the histogram of values of the correlations in the PD
starting point matrix $(\rho)^{PD;Start}$ (solid line) compared with the best-fit stressed
result (dotted line). The starting matrix is obtained using the SVD procedure
described in the next section. The least-squares optimal procedure results in more
stress, closer to the target than the starting matrix.


[Figure: Stressed 1st Stage SVD vs. Best Fit. Histogram of the correlations in the PD
starting matrix (solid line) and in the best-fit stressed matrix (dotted line).]


Stressed Correlations for Equity, Currency, or Rate Baskets


The above example illustrates the idea that stressed correlation analysis can be 
done with baskets. This includes many practical deals. The baskets that can be 
profitably analyzed for correlation risk include baskets of equities, baskets of 
currencies, baskets of rates, etc. 

If a multifactor forward-rate model is used, the effect of stressed correlations 
between forward rates on interest-rate derivatives can be examined. 

In practice, the analysis can be done with hundreds of variables. 


SVD Algorithm for the Starting PD Correlation Matrix 


Summary of the Starting Algorithm (SVD + Renormalization) 
A useful algorithm that produces a positive definite correlation matrix $(\rho)^{PD}$
from a non-positive-definite stressed target matrix $(\rho)^{NPD}$ uses the Singular
Value Decomposition (SVD). We can use this to provide the PD starting point in
the least-squares procedure outlined above. We can also use it to transform NPD
historical correlation matrices that have some data inconsistencies into legal
matrices.

The procedure is as follows⁸ ⁹:


• Run the Singular Value Decomposition¹⁰ (SVD) to obtain the eigenvalue
spectrum of the NPD matrix.

• Set all non-positive eigenvalues to a small positive value. The resulting
matrix is PD.

• Renormalize the eigenfunctions to get unit diagonal correlation matrix
elements and to restore the sum of the eigenvalues to the correct value N.

• Rerun the SVD to transform the renormalized eigenfunctions from the
previous step into an orthonormal set.


8 History: This SVD + Renormalization algorithm was formulated independently by me
in 1998. It turns out to be a common procedure.


9 Acknowledgements: I thank Eduardo Epperlein and Kevin Jian for informative
discussions on correlations.


10 The SVD and the Cholesky Decomposition: The Cholesky decomposition cannot
handle NPD matrices; the SVD can. For PD matrices the results of the two algorithms are
the same.


The Singular Value Decomposition (SVD)


The SVD is a time-honored method for dealing with NPD matrices. Here, we
have a symmetric matrix, simplifying the SVD. We write the matrix equation
$(\rho) = U \Lambda V^T$, where $\Lambda$ is the diagonal matrix of eigenvalues $\lambda^{(a)} = \Lambda_{aa}$
and U is the matrix of eigenfunctions $\psi^{(a)}$, the eigenfunctions having
components $\psi^{(a)}_b = U_{ba}$. See Ch. 22 for the relation of SVD with the Cholesky
decomposition.


Two Important Technical SVD Points 
There are two important technical details¹¹. First, in our case with a symmetric
$(\rho)^{NPD}$, the matrix V is the same as U except for one important trap arising
from a convention. The convention is used, for example, in the SVD routine in the
Numerical Recipes book. If a negative eigenvalue $\lambda^{(a)} < 0$ exists, the routine
reassigns the absolute positive value to that eigenvalue and hides the minus sign
in the matrix V.

Second, degeneracies may occur in the correlation matrix because identical
degenerate time series may be present¹². Some SVD routines do not handle
degeneracies well. Therefore “Magic Dust” can be added to the degenerate time
series in order to break the degeneracies¹³.
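
The first point, the sign convention, can be checked with a short sketch. It uses NumPy's `numpy.linalg.svd` as a stand-in for the routine mentioned in the text (an assumption; NumPy also returns non-negative singular values), and the 3x3 NPD matrix is a made-up example.

```python
import numpy as np

# Hypothetical symmetric NPD matrix (its determinant is negative).
rho_npd = np.array([[1.0, 0.9, -0.8],
                    [0.9, 1.0, 0.7],
                    [-0.8, 0.7, 1.0]])

u, s, vt = np.linalg.svd(rho_npd)      # singular values s are all >= 0

# For a symmetric matrix, column a of V equals column a of U up to a sign.
# The sign is -1 exactly where the eigenvalue is negative.
signs = np.sign(np.sum(u * vt.T, axis=0))
eig_from_svd = np.sort(signs * s)

eig_direct = np.linalg.eigvalsh(rho_npd)   # true eigenvalues, ascending
```

Comparing the columns of U and V recovers the hidden minus sign; a symmetric eigenvalue routine such as `numpy.linalg.eigvalsh` returns the signed eigenvalues directly and avoids the trap.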


The Renormalization of the Eigenvalues and Eigenfunctions 


The whole problem with an NPD stressed matrix $(\rho)^{NPD}$ is that it contains
negative or zero eigenvalues. Therefore, a straightforward trick to restore positive
definiteness is to renormalize (i.e. change by hand) all non-positive eigenvalues
of $\Lambda$ to a small positive number $\varepsilon > 0$. Although this may seem arbitrary, our
only goal here is to get some PD matrix to use as a starting point in the
least-squares fit¹⁴. Therefore, we replace the diagonal matrix of eigenvalues $\Lambda$ by this
strictly positive diagonal matrix $\Lambda^{(PD)}$, i.e. $\Lambda \to \Lambda^{(PD)}$. If the number of bad
non-positive eigenvalues is $N_{Bad}$, then $\Lambda^{(PD)}$ contains the original positive
eigenvalues of $\Lambda$, along with $N_{Bad}$ copies of $\varepsilon$.


11 Psychology: If you don’t know the keys to these details, you can go bonkers.


12 Degenerate Time Series: Degeneracies can occur if not enough data exist, and one
time series is being used as a proxy for two variables.


13 “Magic Dust”: This is an added random amount small enough to be past all significant
digits of the numbers in each copy of the time series producing the degeneracy. This
addition causes no change in the time series to its existing accuracy and it removes the
degeneracies.

Having performed this eigenvalue change, $V = U$ holds in the SVD.
However, the reconstruction of the correlation matrix through the prescription
$U \Lambda^{(PD)} U^T$ with the new $\Lambda^{(PD)}$ matrix is not satisfactory. This is because the
resulting putative diagonal correlation elements will not be one.

In order to guarantee the correlation matrix has ones on the diagonal, we need
to renormalize the eigenfunctions. We therefore renormalize the matrix of
eigenfunctions U into a new matrix $U^{(UnitDiag)}$, i.e. $U \to U^{(UnitDiag)}$. Here
$U^{(UnitDiag)}_{ab} = \Gamma_a U_{ab}$, where the renormalization factor $\Gamma_a$ is defined as

$\Gamma_a = \left[ \sum_{b=1}^{N} U_{ab}^2 \, \Lambda^{(PD)}_{bb} \right]^{-1/2}$    (24.4)

Note the quantity under the square root is strictly positive; we naturally take the
positive square root. In addition, the sum effectively stops at $N - N_{Bad}$, because
the remaining terms are proportional to the small number $\varepsilon$.
The correlation matrix (for this algorithm so far) is then

$(\rho)^{PD} = U^{(UnitDiag)} \Lambda^{(PD)} U^{(UnitDiag)T}$    (24.5)

As can be seen by a one-line calculation, the diagonal elements are now one, i.e.
$(\rho)^{PD}_{aa} = 1$, as required.
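
The eigenvalue floor and the renormalization of Eqs. (24.4) and (24.5) can be sketched in a few lines. This is a minimal illustration, assuming NumPy's symmetric eigen-solver `numpy.linalg.eigh` in place of an SVD routine (which sidesteps the sign convention discussed above). The function name `pd_start_matrix`, the value of `eps`, and the example matrix are hypothetical.

```python
import numpy as np

def pd_start_matrix(rho_npd, eps=1e-6):
    """First-stage PD matrix: floor the eigenvalues, then restore a unit diagonal."""
    lam, u = np.linalg.eigh(rho_npd)            # rho = U diag(lam) U^T
    lam_pd = np.where(lam > 0.0, lam, eps)      # non-positive eigenvalues -> eps
    gamma = 1.0 / np.sqrt((u ** 2) @ lam_pd)    # Gamma_a, Eq. (24.4)
    u_unit = gamma[:, None] * u                 # U^(UnitDiag)_ab = Gamma_a U_ab
    return u_unit @ np.diag(lam_pd) @ u_unit.T  # Eq. (24.5)

# Hypothetical NPD stressed matrix.
rho_npd = np.array([[1.0, 0.9, -0.8],
                    [0.9, 1.0, 0.7],
                    [-0.8, 0.7, 1.0]])
rho_start = pd_start_matrix(rho_npd)
eigs_start = np.linalg.eigvalsh(rho_start)
```

Since the diagonal of `rho_start` is one, its trace, and therefore the sum of its eigenvalues, is automatically N.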


The Square-Root Correlation Matrix Using SVD or Cholesky 
In order to run simulations, we need the square root of a PD correlation matrix.
Now the matrix $U^{(UnitDiag)}$ we just constructed is not a matrix of eigenfunctions
of the correlation matrix, due to the renormalization factors $\{\Gamma_a\}$. Hence, we
need to perform a rotation by running the SVD again in order to reorganize the
expression for $(\rho)^{PD}$ as $(\rho)^{PD} = U^{(Final)} \Lambda^{(Final)} U^{(Final)T}$. The final matrix
$U^{(Final)}$ is the (orthogonal) matrix of eigenfunctions of $(\rho)^{PD}$.

Then we get the desired square root matrix in the usual way, viz
$[(\rho)^{PD}]^{1/2} = U^{(Final)} [\Lambda^{(Final)}]^{1/2} U^{(Final)T}$. Because $(\rho)^{PD}$ is indeed a positive
definite matrix, this square root can be found using either SVD or the Cholesky
decomposition¹⁵.


14 VAR: This first-stage SVD + Renormalization technique, without the least-squares
fitting, is often used for standard VAR applications, where some NPD problems arise
from somewhat dirty or inconsistent data.
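
As an illustration, both square roots can be computed and used to generate correlated Gaussian variables. The sketch assumes NumPy's `eigh` and `cholesky`; the 3x3 PD matrix and the sample size are made-up values.

```python
import numpy as np

# Hypothetical PD correlation matrix.
rho_pd = np.array([[1.0, 0.6, 0.3],
                   [0.6, 1.0, 0.5],
                   [0.3, 0.5, 1.0]])

# Symmetric square root from the eigen-decomposition.
lam, u = np.linalg.eigh(rho_pd)
sqrt_sym = u @ np.diag(np.sqrt(lam)) @ u.T

# Lower-triangular square root from the Cholesky decomposition.
chol = np.linalg.cholesky(rho_pd)

# Correlated Gaussian draws: independent normals times a square root.
rng = np.random.default_rng(1)
z = rng.standard_normal((200_000, 3))
x_sym = z @ sqrt_sym.T
x_chol = z @ chol.T
corr_sym = np.corrcoef(x_sym, rowvar=False)
corr_chol = np.corrcoef(x_chol, rowvar=False)
```

The two square-root matrices differ by an orthogonal rotation, but each one multiplied by its transpose gives back the same correlation matrix, so both produce samples with the same correlations.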


PD Stressed Correlations by “Walking through the Matrix” 


This section describes an alternative method¹⁶ for obtaining stressed PD
correlation matrices. The idea is to walk through the correlation matrix by steps
along a path, successively stressing correlations. We ensure that each step is
taken such that the resulting stressed matrix at that step is PD. Hence, at the first
step we stress matrix element $\rho^{Step\,1}_{\alpha_1,\beta_1}$ in some row $\alpha_1$ and column $\beta_1$. Moving
along a path to step #m, we stress $\rho^{Step\,m}_{\alpha_m,\beta_m}$. The allowed stress amount for a
matrix element at a given step is naturally constrained by the stresses along the
chosen path up to that step. In practice, as we walk deeper into the matrix, it turns
out that the amount of allowed stress for matrix elements becomes more reduced.
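
A minimal sketch of the walking idea follows. The text does not say how the allowed stress at each step is determined, so the rule used here (try the requested stress, and halve it until the matrix passes a Cholesky test) is an assumption made only for illustration. The function names, the path, and the stress amounts are hypothetical.

```python
import numpy as np

def is_pd(m):
    """Positive-definiteness test: the Cholesky decomposition succeeds only for PD input."""
    try:
        np.linalg.cholesky(m)
        return True
    except np.linalg.LinAlgError:
        return False

def walk_stress(rho, path, stresses, max_halvings=20):
    """Stress one element per step, shrinking the move until the matrix stays PD."""
    out = rho.copy()
    applied = []
    for (a, b), requested in zip(path, stresses):
        step = requested
        for _ in range(max_halvings):
            trial = out.copy()
            trial[a, b] = trial[b, a] = np.clip(out[a, b] + step, -1.0, 1.0)
            if is_pd(trial):
                out = trial
                applied.append(step)
                break
            step *= 0.5              # assumed rule: halve the stress and retry
        else:
            applied.append(0.0)      # no admissible stress found at this step
    return out, applied

rho = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
path = [(0, 1), (0, 2), (1, 2)]      # hypothetical walking order
stresses = [0.35, -0.9, 0.3]         # hypothetical requested stresses
stressed, applied = walk_stress(rho, path, stresses)
```

In this example the first element takes its full requested stress, while the second request is cut back because the full move would make the matrix NPD given the first step, which is the path dependence described above.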


Nearest Neighbor Technique for PD Stressed Correlations 


A “nearest neighbor NNR” technique was invented for stressed PD
correlations, having the metric of keeping as many correlations as mathematically
possible equal to the target correlations specified by the user.

The idea is to move systematically through the target matrix using the
$\mathcal{N}$-sphere co-ordinates (cf. Ch. 22). If the angles for a correlation matrix element are
within trigonometric bounds, then this matrix element is retained in the final
stressed correlation matrix. If an “illegal” azimuthal angle $\varphi_{\alpha\beta}^{(Target)}$ is found (i.e.
$|\cos(\varphi_{\alpha\beta}^{(Target)})| > 1$, so the angle is not real), this angle is replaced by the average
of the azimuthal angles of the two neighboring matrix elements, i.e. half of their
sum. By induction both of these angles are already
real. The boundaries need special attention. This procedure makes the resulting
matrix positive definite because all angles are physical; cf. Ch. 22. The algorithm
is very fast.


15 Dimensional Considerations: However, it appears in practice that the SVD is more
stable for large dimensions than is the Cholesky decomposition. For this reason, it may be
preferable to forget about the Cholesky decomposition entirely.


16 Path Walking vs. Least Squares: The path-walking method was devised by me earlier
than the least-squares approach. Walking the matrix can be useful for specific stress
scenarios. In particular, if we are most concerned with postulating definite stressed values
for only a few correlations, this method is useful. Recall that in the least-squares
approach, we can influence the fit by choosing weights. However, we cannot guarantee
any specific stressed value for any particular correlation matrix element. On the other
hand, path walking is clumsy to implement for large matrices and was therefore replaced
by the more tractable LS approach for corporate VAR applications.


25. Models for Correlation Dynamics, Uncertainties
(Tech. Index 6/10)


In this chapter, we look at some models for the underlying dynamics of market 
correlations, and for correlation uncertainties¹. This includes the time dependence
of correlations. Our point of view is that correlations have an intrinsic meaning, 
independent of historical time series. As an analogy, volatility is often treated as 
a dynamical variable in stochastic volatility models, independent of historical 
time series. In a similar way, correlations can be treated as dynamical variables. 
In the last chapter, we were concerned with a different issue, namely given some 
correlations, to find ways to consistently stress the given correlations. Our 
purpose here is to investigate some possibilities for the underlying dynamics of 
the correlations themselves. 

Since the 1st edition, the technique of modeling correlations from smoothed
time series using Singular Spectrum Analysis was developed, as mentioned at the 
end of the chapter with details in Ch. 36. We also describe an extension of 
standard CAPM-like factor models including correlated residual risk in order to 
better describe correlations. 


“Just Make the Correlations Zero” Model; Three Versions 


We start with the simple zero-correlation heuristic in three versions A, B, C².


Zero Correlations, Version A: "No Reason, No Correlation" 


Some people think that correlations should be set equal to zero if you cannot 
think of a rationale or reason that nonzero correlations should be present. What 
these people effectively have in mind is that scenarios should be put together for 
correlations. That is, if there were no motivation for a particular correlation to be 
nonzero, their scenario would be a default of zero correlation. 


1 Numerical Problems Extracting Correlations: In Ch. 37, we deal further with the
numerically induced instabilities of correlations, from an historical data point of view.


2 Zero-Correlation Adherents: All these reasons (sic) for setting correlations equal to
zero have been proposed, utilized, and vociferously defended. Note that setting a
correlation equal to zero is making a strong dynamical assumption of independence.


A good response to this attitude is that while correlation scenarios are indeed
useful, real-world non-zero correlations exist whether or not we have an 
underlying theory to describe them. It is perhaps presumptuous and fallacious to 
think that we are smart enough to give a reason for each non-zero correlation. We 
ignore potentially dangerous correlations at our peril, whether we understand 
why they are non-zero or not. 

An example of correlation danger is provided by failures of diversification. 
Diversification assumes that by holding products in many markets, portfolio 
uncertainties can be reduced to small values through cancellations. However, in a 
stressed market environment, high correlations appear when investors panic, and 
a flight to quality ensues. Various spread products become correlated regardless 
of markets, just because they are in fact spread products that are all being sold, as 
investors collectively rush to safe havens like US Treasuries. The false 
assumption of small correlations led to the demise of famous hedge funds and 
Arb units in 1998. 


Zero Correlations, Version B: “We Need Long Time Scales”


A variation on the theme of “just make the correlations zero” is to look at very 
long time scales (e.g. years). At such long time scales, historical correlation 
fluctuations can average out to small values. Using long time scales has the added 
apparent theoretical benefit that averaging over long times suppresses windowing 
errors, revealing the stable correlation value if there is one. One defense of this 
approach is that “there is more information in a longer time series, so we ought to 
use long time series to determine the correlation”. This idea is appropriate for 
investors with long time horizons, on the order of years. 

The proponents of this argument make an important point. There is indeed 
more information in longer time series. This information can and should be used 
to generate information about the uncertainties in the correlations. 

The basic problem however is that the underlying assumption of the existence 
of a stable correlation is incorrect. Correlation breakdowns and instabilities
involve intrinsic uncertainties independent of the numerical windowing 
uncertainties. The associated problem is that the assumption of long time scales is 
manifestly inappropriate for investors or strategies that have shorter time scales 
(e.g. months or weeks). Therefore, long time series should not be used to estimate 
the correlations themselves. 

The bottom line is that market disasters involving correlation changes do not 
wait for mathematical theorem conditions of long times to be satisfied. In spite of 
windowing noise, there is no choice but to look at correlations over limited time 
intervals. Moreover, there is good evidence that no well-defined underlying 
stable correlations exist. During times of market stress, a “strategy” based on the
assumption that there is a long-term stable correlation cannot be carried out
without considerable financial damage³.


Zero Correlations, Version C: “It’s Too Hard Otherwise”


Another example of the zero-correlation model is the forced assumption when the 
number of market variables becomes so large that dense correlation matrices 
cannot be treated by the computer software and/or hardware⁴. In that case, some
block-diagonal assumptions have to be made, setting the inter-block correlations 
to zero. For example, all intra FX correlations can be put into one block and all 
intra interest-rate correlations in another block, with inter FX-interest rate 
correlations set to zero. 

Such zero-correlation inter-market assumptions will be unable to pick up the 
risk of, for example, an inflationary environment affecting both FX and interest 
rates. The trade-off is that the intra-market correlations are more accurately 
described with large numbers of variables. 


Long-Term vs. Short-Term Correlations; Macro-Micro Model 


In this section, we present a new model for underlying correlation dynamics, 
motivated by the Macro-Micro (MM) model; cf. Ch. 47-51. The idea is that 
macroeconomics provides strong long-term influences on correlations. The 
Macro component of the MM model attempts to model these influences in a 
parsimonious fashion, leaving the task of constructing the real macroeconomic 
underpinnings to the future. The Macro component gives the long-term 
correlations. The Micro component provides for short time scale noise that would 
average out to zero in any windowed measure of long-term correlation. The 
Micro component gives the short-term correlations. 


3 What? Trader Angst? Traders, with not much measurable angst over mathematical
niceties for long time scales, react quickly to big or sudden changes in correlations in the
markets. In this way, they retain their jobs.


4 Large Correlation Matrix PD Problems: In particular, the imposition of positive
definiteness for large correlation matrices becomes problematic when the number of
variables is above a thousand, as an order of magnitude. In that case, block-diagonal
assumptions can be used.


Macro Long-term Correlation Simulation Example


To illustrate the idea⁵, we first deal with a single correlation $\rho_{\alpha\beta}$. Later on we
shall present a recipe for moving the whole correlation matrix, though with
simpler dynamics. The Macro component of the MM model for correlations
proceeds according to the prescription of quasi-random correlation slopes or
correlation drifts:


• Use a Gaussian probability distribution for the correlation slope or drift.

• Specify the probability distribution of time intervals between slope changes,
with minimum time interval defined by a minimum time cutoff.

• Bound the correlations by ±1.


The Macro Quasi-Random Slope for Correlation Simulation 


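The model for the time change of the correlation involves a Macro Correlation
Slope $\mu^{CorrSlope}_{\rho\alpha\beta}(t)$. The defining equation reads:

$d_t\rho_{\alpha\beta}(t) \equiv \rho_{\alpha\beta}(t+dt) - \rho_{\alpha\beta}(t) = \mu^{CorrSlope}_{\rho\alpha\beta}(t)\,dt$    (25.1)

The quasi-random correlation slope $\mu^{CorrSlope}_{\rho\alpha\beta}(t)$ changes the correlation
$\rho_{\alpha\beta}$ as time changes.

The Macro correlation slope $\mu^{CorrSlope}_{\rho\alpha\beta}(t)$ is not a function. It changes in a
quasi-random fashion at discrete times $\{t^{(k)}_{ChangeSlope}\}$, as defined next:

$\mu^{CorrSlope}_{\rho\alpha\beta}(t) = \begin{cases} \mu^{CorrSlope}_{\rho\alpha\beta}(t-dt)\,\Theta[1-\chi] & \text{if } t \neq t^{(k)}_{ChangeSlope} \\ S[\rho_{\alpha\beta}(t)]\; N(\mu^{CorrSlope}_{Avg;\rho\alpha\beta},\, \sigma^{CorrSlope}_{\rho\alpha\beta}) & \text{if } t = t^{(k)}_{ChangeSlope} \end{cases}$    (25.2)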

The idea is shown in the picture below:


[Figure: Change in Macro Correlation Slope. The slope of the correlation $\rho_{\alpha\beta}(t)$
changes with time, in a quasi-random fashion.]


When $\mu^{CorrSlope}_{\rho\alpha\beta}(t)$ changes at some $t^{(k)}_{ChangeSlope}$, the possible new values are
drawn randomly from a normal distribution $N(\mu^{CorrSlope}_{Avg;\rho\alpha\beta}, \sigma^{CorrSlope}_{\rho\alpha\beta})$. The normal
distribution is specified by the average slope $\mu^{CorrSlope}_{Avg;\rho\alpha\beta}$ and the slope width
$\sigma^{CorrSlope}_{\rho\alpha\beta}$.

The function⁶ $\Theta[1-\chi]$ with $\chi = \mathrm{sgn}(\mu^{CorrSlope}_{\rho\alpha\beta}(t))\,\rho_{\alpha\beta}(t)$ prevents the
correlation from exceeding ±1 if there is no correlation slope change. The
function $S[\rho_{\alpha\beta}(t)]$ prevents the correlation from exceeding ±1 if there is a
correlation slope change. We can, for example, specify $S[\rho_{\alpha\beta}(t)]$ using the ansatz⁷

$S[\rho_{\alpha\beta}(t)] = \sqrt{1 - \rho^2_{\alpha\beta}(t)}$    (25.3)

which vanishes at ±1.

We are not interested in small time-scale noise, and we therefore set a
possible Micro stochastic noise component $\sigma^{Micro}_{\rho\alpha\beta}\,\eta(t)\,dt \approx 0$ in
Eqn. (25.1).


5 Extension of the MM Correlation Model to Correlation Matrices: The application to
a whole correlation matrix would involve rendering the matrix positive definite, as
described in previous chapters. This would have to be done at each point in time.


The Time Intervals for Macro Correlation Slope Changes 


We need to specify the (discrete) times at which the correlation slope changes. In
the MM model, the distribution for time-change intervals is taken as (cf. Ch. 50):

$\wp(\Delta t) = \mathcal{N} \exp\left[ -\frac{(\Delta t - \tau_{MacroAvg})^2}{2\,(\sigma^{MacroTime}_{\rho\alpha\beta})^2} \right] \Theta(\Delta t - \tau_{Cutoff})$    (25.4)

Here, $\tau_{Cutoff}$ is the minimum Macro time interval cutoff⁸. The normalization
is $\mathcal{N} = 1 / (\sigma^{MacroTime}_{\rho\alpha\beta}\sqrt{2\pi})$.


Example of Quasi-Random Macro Correlation Slope Changes 
Here is a picture of the idea. For illustration, we simplify eq. (25.4) using a
specific time interval $\tau_{MacroAvg}$, i.e. we take $\wp(\Delta t) = \delta(\Delta t - \tau_{MacroAvg})$,
where $\delta$ is the Dirac delta function. The changes of the underlying correlation
over time were evaluated in a Monte-Carlo simulation using the simplified
quasi-random Macro correlation slope model. Thus the correlation slope changes in this
simplified model take place at regular time intervals, viz
$t^{(k+1)}_{ChangeSlope} = t^{(k)}_{ChangeSlope} + \tau_{MacroAvg}$. In the example, the quasi-random correlation
slope moves the correlation with a starting value $\rho_{\alpha\beta}(t_0)$ and long-term
average $\langle\rho_{\alpha\beta}\rangle$ of 0.1. The time $\tau_{MacroAvg}$ was taken as two months (44 days)
and the average slope $\mu^{CorrSlope}_{Avg;\rho\alpha\beta}$ was taken as zero. The average correlation over
time was taken as $\langle\rho_{\alpha\beta}\rangle = 0.1$.


6 Notation: This function is one if the argument is positive and zero otherwise. Also, sgn
means the sign ±. Finally, note that the argument of $\rho_{\alpha\beta}$ is at time t, not t - dt. Thus we
look to see if the result for $\rho_{\alpha\beta}(t)$ would be illegal, and if so we impose the constraint.


7 Motivation: This square-root form of the factor is just phenomenology. Any other
function vanishing at the endpoints $\rho = \pm 1$ would also be possible.


8 Brownian Limit of the MM Model: As the cutoff time goes to zero, the Macro slope
model reduces to ordinary Brownian motion with the additional factor limiting the
correlations inside ±1.
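
A minimal sketch of the simplified model, with slope changes at regular intervals, is given below. It is an interpretation under stated assumptions rather than the code behind the figure: the step function is read as "drop the slope if the next step would leave [-1, 1]", the slope width `sigma_slope` (per day) is a made-up value, and no pull toward the long-term average is imposed. The function and parameter names are hypothetical.

```python
import math
import random

def simulate_macro_corr(rho0=0.1, n_days=1000, tau=44, mu_avg=0.0,
                        sigma_slope=0.005, seed=0):
    """One Monte-Carlo path of a correlation driven by a quasi-random slope."""
    rng = random.Random(seed)
    rho, slope = rho0, 0.0
    path = [rho]
    for day in range(1, n_days + 1):
        if (day - 1) % tau == 0:
            # Slope change every tau days: Gaussian draw damped by
            # S[rho] = sqrt(1 - rho^2), which vanishes at +-1.
            damping = math.sqrt(max(0.0, 1.0 - rho * rho))
            slope = damping * rng.gauss(mu_avg, sigma_slope)
        if abs(rho + slope) > 1.0:
            # Assumed reading of the step function: the step would be
            # illegal, so the slope is set to zero.
            slope = 0.0
        rho += slope                 # Eq. (25.1) with dt = 1 day
        path.append(rho)
    return path

path = simulate_macro_corr()
```

Between slope changes the correlation moves along straight-line segments, and the two guards keep every point of the path inside [-1, 1].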


The graph gives one MC realization.


[Figure: Underlying Correlation: Macro Model. Nominal correlation with Macro random
slope, plotted against time; the vertical axis runs from -100% to 100%.]


Macro Moves for the Whole Correlation Matrix


We have been discussing the dynamics of movements in a single correlation
matrix element. For a whole correlation matrix, we need positive definiteness.
One idea for correlation changes maintaining positive definiteness uses the
spherical angle formalism (cf. Ch. 22). The initial correlation matrix
$\rho^{(\text{initial})}_{\alpha\beta}$ specifies initial angles $\{\theta^{(\text{initial})}_{\beta\alpha}\}$, $\beta > \alpha \ge 1$, on the
$\mathcal{N}$-sphere with $\mathcal{N} = N - 1$. We choose a final correlation matrix
$\rho^{(\text{final})}_{\alpha\beta}$ that specifies final angles $\{\theta^{(\text{final})}_{\beta\alpha}\}$ on the $\mathcal{N}$-sphere. Then we
interpolate the angles from their initial values to their final values linearly with
respect to a parameter. The parameter can be the time $t \in (t_{\text{initial}}, t_{\text{final}})$ during
the total time $T$ while the macro move takes place. We get

$$\theta_{\beta\alpha}(t) = \left(\frac{t_{\text{final}} - t}{T}\right)\theta^{(\text{initial})}_{\beta\alpha} + \left(\frac{t - t_{\text{initial}}}{T}\right)\theta^{(\text{final})}_{\beta\alpha} \qquad (25.5)$$
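The angle interpolation can be sketched numerically. The following is only an illustration under stated assumptions: it uses the common hypersphere (lower-triangular) parametrization in which each row of a matrix `B` is a unit vector built from cosines and products of sines, which may differ in convention from the formalism of Ch. 22. The function names are hypothetical. Because the interpolated matrix is always of the form `B B^T` with unit-length rows, it has ones on the diagonal and is positive semi-definite at every intermediate time.

```python
import math

def rows_from_angles(theta):
    """theta[i] holds i angles for row i (row 0 has none).
    Returns a lower-triangular matrix B whose rows are unit vectors."""
    n = len(theta)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sin_prod = 1.0
        for j in range(i):
            B[i][j] = math.cos(theta[i][j]) * sin_prod
            sin_prod *= math.sin(theta[i][j])
        B[i][i] = sin_prod  # remaining length goes on the diagonal
    return B

def corr_from_angles(theta):
    """Correlation matrix rho = B B^T (unit diagonal by construction)."""
    B = rows_from_angles(theta)
    n = len(B)
    return [[sum(B[i][k] * B[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def interpolate_angles(theta_init, theta_final, t, t_init, T):
    """Linear interpolation of every angle, as in Eq. (25.5)."""
    w = (t - t_init) / T
    return [[(1.0 - w) * a + w * b for a, b in zip(ri, rf)]
            for ri, rf in zip(theta_init, theta_final)]

# Illustrative (made-up) angles for a 3x3 correlation matrix
theta_init = [[], [0.3], [1.0, 0.7]]
theta_final = [[], [1.2], [0.4, 2.0]]
rho_mid = corr_from_angles(interpolate_angles(theta_init, theta_final, 5.0, 0.0, 10.0))
```

Interpolating the angles rather than the matrix elements is the design point: a straight-line path in angle space stays on the sphere, so legality of the matrix never has to be checked along the way.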


Correlation Dependence on Volatility 


A complication that exists in modeling correlations is that correlations can 
depend on volatility. Such a dependency is natural in stressed markets. When 
markets become stressed, volatilities increase, jumps occur, and correlations 
become more pronounced away from zero as investors undergo various forms of 
panic behavior and a flight to quality ensues. 

The behavior of correlations under this circumstance is difficult to extract. 


Consider a jump in a variable $x_\alpha$, increasing its volatility. The magnitude and
sign of the correlation change $\delta\rho_{\alpha\beta} \propto d_t x_\alpha(t)\cdot d_t x_\beta(t)$ depends on the
amount of change in the other variable, i.e. $d_t x_\beta(t)$.


Numerical Example of Correlation Dependence on Volatility 
We take two variables, use the MM model for the correlation $\rho$ described above,
and write a bivariate model for the time changes $d_t x_1$, $d_t x_2$ excluding jumps:

$$d_t x_1 = \sigma_1 \eta_1\, dt$$
$$d_t x_2 = \sigma_2\left[\rho\,\eta_1 + \sqrt{1-\rho^2}\;\eta_2\right] dt \qquad (25.6)$$

The notation is $\eta(t) = dz(t)/dt$. We also include a 10 SD jump for $d_t x_2$
over one day in the middle of the time series.
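A minimal sketch of this kind of experiment follows. It is not the simulation used for the figures: for brevity the correlation is held fixed rather than driven by the MM quasi-random slope model, the time step is one day so each $\eta\,dt$ is replaced by a unit Gaussian draw, and all function names and parameter values are assumptions for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Sample correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n_days, rho, sigma1, sigma2, jump_sd=0.0, seed=0):
    """Daily changes following the bivariate form of Eq. (25.6), fixed rho.
    An optional jump of jump_sd standard deviations hits x2 on the middle day."""
    rng = random.Random(seed)
    dx1, dx2 = [], []
    for _ in range(n_days):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dx1.append(sigma1 * z1)
        dx2.append(sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    if jump_sd:
        dx2[n_days // 2] += jump_sd * sigma2
    return dx1, dx2

def windowed_corr(dx1, dx2, window=65):
    """Correlation over a trailing window, one value per window end date."""
    return [pearson(dx1[i - window:i], dx2[i - window:i])
            for i in range(window, len(dx1) + 1)]

dx1, dx2 = simulate(500, 0.1, 1.0, 1.0, jump_sd=10.0, seed=1)
corr_with_jump = windowed_corr(dx1, dx2)
```

Running the same seed with `jump_sd=0.0` gives the no-jump series, so the two windowed correlations can be compared date by date.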

Below is the scatter plot of the windowed correlation vs. the total volatility
“Vol 2” of $d_t x_2$ (including the jump and volatility effects on the changes in
correlation slope). The effect of the jump is clearly visible in this scatter plot.
Non-trivial dependences between the correlation and the total volatility are seen,
with a number of string-like continuous regions.


[Figure: Vol 2 with Jump vs Correlation; Macro Quasi-Random Correlation Slopes;
Windows 65 days. Vertical axis: correlation (-50% to 50%); horizontal axis: Vol 2
(0.0 to 2.0).]


To take this sort of effect into account in risk management, correlation stress 
analysis is desirable, since no single correlation value is likely to represent a 
stable parameter. 

There is also a feedback on the total volatility of a variable depending on the 
correlation from the correlation instabilities. 


Windowed Correlations with a Macro Quasi-Random Slope 
We can ask the question of how well windowed correlation measurements
reproduce the given average correlation $\langle\rho_{\alpha\beta}\rangle = 0$ in the presence of the
Macro quasi-random correlation slopes. The comparison in the figure below is
between the series labeled “Windowed Corr, No jump” and “Nominal Corr with
Macro Random Slope” (the latter the same as above). The windowed correlations
qualitatively follow the underlying correlation quasi-random behavior, with
random fluctuations.

We also exhibit the windowed correlation with the big 10 SD jump as above in
$d_t x_2$ to give an idea of windowing correlation uncertainties vs.
jump-volatility-induced correlation uncertainties. The jump in the underlying produces a clear
and large jump in the correlation. The MC run exhibited was chosen to illustrate
this point. Other Monte Carlo runs did not exhibit such a large jump effect.


[Figure: Measures of Correlation Instabilities. Series: Windowed Corr, No jump;
Windowed Corr With jump; Nominal Corr with Macro Random Slope.]


Should We Forget About Correlations? Answer = No


Some people think that because there are complications in extracting correlations, 
we should not think about correlations at all. 

The problem really is that correlations are the natural language. If we want to 
think about relations between movements of gold and Libor, we need to consider 
the gold-Libor correlation. Therefore, we cannot forget about correlations. 

Concurrently, we need to worry about correlation uncertainties. Because we 
use models for pricing and risk that contain volatilities and correlations, the 
uncertainties in the correlations must be considered. Considering movements in 
empirically determined correlations is necessary to cope with the theoretical and 
practical problems involving correlations. 
Can Stochastic Volatility Explain Correlation Instabilities?


Some people think that stochastic volatility can explain the instabilities in 
correlations. Indeed, we have been exhibiting correlation changes induced by 
jumps. 

However, we do not believe that stochastic volatility can explain, e.g., a flight 
to quality, when many correlations tend to increase in magnitude. If the 
correlation uncertainties were simply due to volatility uncertainties, there would 
be no reason for correlation changes to occur in a correlated fashion. 

In any case, because most models do not use stochastic volatility, we are 
again back to the position of using deterministic volatilities and correlations. 
Whatever the cause of the correlation uncertainties, we need to assess the 
correlation risk in the framework of the models we use. Again, we need to stress 
the correlations to assess this risk. 


Implied, Current, and Historical Correlations for Baskets 


Although this chapter is primarily concerned with direct models for correlations 
between market variables, we consider briefly the connection with implied 
correlations. Implied correlations are found by the inverse procedure of utilizing 
market prices of correlation-dependent securities to back out the correlations. The 
idea is the same as for implied volatilities. While consistency with the market is 
desirable, the difficulty is that correlation-dependent securities are often illiquid. 
Hence, market prices may be difficult to obtain, leading to large uncertainties in the
implied correlations⁹.

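Market price information, even if available, may be insufficient to determine
the whole matrix of implied correlations. For example, consider a basket defined
as $B(t) = \sum_{\alpha=1}^{N} w_\alpha S_\alpha(t)$ of securities (stocks, interest rates, etc.) all assumed
lognormal. We saw in Ch. 4 that the basket volatility $\sigma_B$ given a reference time
$t_0$ is approximately

$$\sigma_B \approx \frac{1}{B_0}\left[\sum_{\alpha,\beta=1}^{N} \left(w_\alpha S_{\alpha 0}\right)\sigma_\alpha\,\rho_{\alpha\beta}\,\sigma_\beta\left(w_\beta S_{\beta 0}\right)\right]^{1/2} \qquad (25.7)$$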


? One-sided Markets: For example, the Street may only be selling the product to 
customers without any buying through secondary trading. In that case, there is no market- 
determined “mid point". Estimates of the other side of the market can break down under 
stressed market conditions or "fire sales". 
Here, $B_0 = B(t_0)$, $S_{\alpha 0} = S_\alpha(t_0)$, $S_{\beta 0} = S_\beta(t_0)$. Now suppose we approximate
this using an average correlation, $\rho_{\alpha\beta} \approx \rho_{\text{Avg}}$, $\alpha \ne \beta$. For large $N$ and assuming
that $\rho_{\text{Avg}} \gg 1/N$, the off-diagonal terms dominate and we approximately get the
basket vol as proportional to $\sqrt{\rho_{\text{Avg}}}$, viz

$$\sigma_B \approx \sqrt{\rho_{\text{Avg}}}\;\sum_{\alpha=1}^{N} w_\alpha S_{\alpha 0}\,\sigma_\alpha \Big/ B_0 \qquad (25.8)$$

Given the basket volatility and the independent component volatilities, we can
approximately extract the average correlation $\rho_{\text{Avg}}$.
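As a small numerical check of this extraction, the sketch below computes the basket volatility from the full quadratic form of Eq. (25.7) and then inverts the large-N approximation of Eq. (25.8). The function names and the input numbers are made up for illustration; the point is only that the recovered value is close to the input correlation when N is large and the correlation is well above 1/N.

```python
import math

def basket_vol(weights, spots, vols, corr):
    """Basket volatility from the quadratic form of Eq. (25.7)."""
    a = [w * s for w, s in zip(weights, spots)]   # w_alpha * S_alpha0
    b0 = sum(a)                                   # basket value B_0
    n = len(a)
    var = sum(a[i] * vols[i] * corr[i][j] * vols[j] * a[j]
              for i in range(n) for j in range(n))
    return math.sqrt(var) / b0

def implied_avg_corr(sigma_b, weights, spots, vols):
    """Invert the large-N approximation of Eq. (25.8) for rho_Avg."""
    a = [w * s for w, s in zip(weights, spots)]
    b0 = sum(a)
    lin = sum(ai * vi for ai, vi in zip(a, vols))
    return (sigma_b * b0 / lin) ** 2

# Illustrative inputs: 200 equally weighted names, 20% vols, uniform 0.3 correlation
n, rho_true = 200, 0.3
weights, spots, vols = [1.0] * n, [100.0] * n, [0.2] * n
corr = [[1.0 if i == j else rho_true for j in range(n)] for i in range(n)]
sigma_b = basket_vol(weights, spots, vols, corr)
rho_est = implied_avg_corr(sigma_b, weights, spots, vols)
```

In this uniform case the estimate exceeds the input by (1 − ρ)/N, which is the size of the diagonal terms neglected in Eq. (25.8).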


As time progresses, the correlation risk depends on the current correlation. 
Historical correlation uncertainties can be used to estimate the correlation 
hedging risk. Conversely, this estimated hedging risk (or part of it) could be used 
in the pricing of a new deal, thus specifying the initial implied correlation. 


SSA and Noise-Reduced Correlations - Preview 


Since the 1st edition, based on readings from climate science (cf. Ch. 53), a
well-known geophysics technique called Singular Spectrum Analysis (SSA) was
applied to finance by Dash and developed at Bloomberg LP. The problems with 
correlation instabilities and noise are ameliorated. 

The basic idea is simple. Imagine a moving average smoothing for a time 
series, removing some high-frequency fluctuations. Now take correlations 
between pairs of smoothed time series. Since the time series noise is suppressed, 
the correlation noise is also suppressed. SSA provides a recipe for moving 
average coefficients that are quite sophisticated, not just simple arithmetic 
averages. 

See Ch. 36 for details. 
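The smoothing idea can be illustrated with a toy calculation. This sketch uses a plain arithmetic moving average, not SSA, and the two series (a shared slow sine-wave signal plus independent unit-variance noise) are invented for the purpose. Under those assumptions the correlation measured between the smoothed series is far less diluted by the high-frequency noise than the correlation between the raw series.

```python
import math
import random

def pearson(xs, ys):
    """Sample correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def moving_average(xs, window):
    """Simple arithmetic moving average (a stand-in for SSA coefficients)."""
    return [sum(xs[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(xs))]

rng = random.Random(7)
n = 2000
signal = [math.sin(2.0 * math.pi * t / 200.0) for t in range(n)]  # shared slow component
series_a = [s + rng.gauss(0.0, 1.0) for s in signal]              # plus independent noise
series_b = [s + rng.gauss(0.0, 1.0) for s in signal]

raw_corr = pearson(series_a, series_b)
smooth_corr = pearson(moving_average(series_a, 21), moving_average(series_b, 21))
```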


Factor Models, Idiosyncratic Risk, and Correlations 


Factor models are used to reduce complexity, commonly employed for portfolio 
management on the buy-side. For example, portfolios with thousands of stocks
can be approximately described through cross-sectional regression models' or 
time series regression-based factor models", generalizing the CAPM" (Capital 
Asset Pricing Model) to multiple factors; see". Here we point out that 
improvements on standard factor model assumptions can be made. 
We call $\{d_t x_\alpha\}$, $\alpha = 1 \ldots N$ the set of stock unit returns¹⁰ (with means
subtracted and volatilities divided out) with return correlation matrix $\rho^{(xx)}$. We
call $\{d_t f_\beta\}$, $\beta = 1 \ldots M$ the set of factor unit returns with return correlation
matrix $\rho^{(ff)}$. We take $M \ll N$ to reduce complexity. We call $\rho^{(xf)}$ the
correlation matrix between the stock returns and factor returns.

For time series regressions of stock returns on factors that are (e.g.) stock
indices, linear algebra gives the factor model stock return $d_t x^{(\text{FactorModel})}_\alpha$ derived
from the factor returns as¹¹ ¹²

$$d_t x^{(\text{FactorModel})}_\alpha = \sum_{\beta,\beta'=1}^{M} \rho^{(xf)}_{\alpha\beta}\left[\left(\rho^{(ff)}\right)^{-1}\right]_{\beta\beta'} d_t f_{\beta'} + d_t \varepsilon^{(\text{Residual})}_\alpha \qquad (25.9)$$

The residuals $\{d_t \varepsilon^{(\text{Residual})}_\alpha\}$ are usually considered uncorrelated for different
stock returns. It is this assumption that we wish to improve. To this end, we take
correlated idiosyncratic residuals not normally included in factor models, i.e.

$$\left\langle d_t \varepsilon^{(\text{Residual})}_\alpha\, d_t \varepsilon^{(\text{Residual})}_{\alpha'} \right\rangle \ne 0 \qquad (25.10)$$


0 Return Definitions: Either time changes for the stock prices or percent changes 
(logarithmic returns). 


' Homework: Stock Returns in Factor Model: To see this, use the expressions in 
terms of unit vectors leading to the Cholesky decomposition as explained in Ch. 22. 


? Financial Predictive Stress and Ulrica’s Stress Prediction: If scenario values are 
inserted for the factor returns (either historical or “what-if”), the equation (with residuals 
ignored) “predicts” stock returns through the factor model. This is called a “predictive 
stress". There is nothing actually predictive in time. 

For a REAL prediction, consider the sorceress Ulrica (speaking to Count Riccardo in 
disguise, whom she has just met in Verdi’s opera Un Ballo in Maschera). Ulrica says: 
Ebben, presto morrai...per man d'un amico, which is precisely what happens (ref).


? Acknowledgements: I thank Eugene Stern for conversations. 


Cross Sectional Models: For cross sectional models, the factor return coefficients or 
“betas” are “loadings” that are fixed in the model. The factor values at each time are 
determined by fitting to the empirical stock returns at that time; subtracting factor values 
for successive times produce the factor returns. 
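The factor returns and residuals are however still assumed to be uncorrelated,
viz $\left\langle d_t f_\beta\, d_t \varepsilon^{(\text{Residual})}_\alpha \right\rangle = 0$.

Using Eq. (25.9), the stock-return correlation matrix from the factor model,
$\rho^{(xx)(\text{FactorModel})}_{\alpha\alpha'} = \left\langle d_t x^{(\text{FactorModel})}_\alpha\, d_t x^{(\text{FactorModel})}_{\alpha'} \right\rangle$, is of the form

$$\rho^{(xx)(\text{FactorModel})}_{\alpha\alpha'} = \sum_{\beta,\beta'=1}^{M} \rho^{(xf)}_{\alpha\beta}\left[\left(\rho^{(ff)}\right)^{-1}\right]_{\beta\beta'}\rho^{(xf)}_{\alpha'\beta'} + \Lambda_\alpha\,\delta_{\alpha\alpha'} \qquad (25.11)$$

Here $\delta_{\alpha\alpha'}$ is one if the indices are equal and zero otherwise. The second term is
from the uncorrelated residuals $\left\langle d_t \varepsilon^{(\text{Residual})}_\alpha\, d_t \varepsilon^{(\text{Residual})}_{\alpha'} \right\rangle$, which due to $\delta_{\alpha\alpha'}$
have no off-diagonal terms. The quantity $\Lambda_\alpha$ is needed for $\rho^{(xx)(\text{FactorModel})}_{\alpha\alpha} = 1$
(one on the diagonal); by inspection we get

$$\Lambda_\alpha = 1 - \sum_{\beta,\beta'=1}^{M} \rho^{(xf)}_{\alpha\beta}\left[\left(\rho^{(ff)}\right)^{-1}\right]_{\beta\beta'}\rho^{(xf)}_{\alpha\beta'} \qquad (25.12)$$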

Results with only One Factor 

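If $M = 1$ and only one factor $f_1$, the sums over $\beta, \beta'$ disappear, $\left(\rho^{(ff)}\right)^{-1} = 1$,
and Eq. (25.11) becomes $\rho^{(xx)(\text{FactorModel})}_{\alpha\alpha'} = \delta_{\alpha\alpha'} + (1 - \delta_{\alpha\alpha'})\,\rho^{(xf)}_{\alpha 1}\rho^{(xf)}_{\alpha' 1}$. The
off-diagonal elements of this matrix (the correlations) are $\rho^{(xf)}_{\alpha 1}\rho^{(xf)}_{\alpha' 1}$. Without the
residual, the correlations would all be one.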


Correlated Idiosyncratic Residuals and Applications 


We now consider a model for correlated residuals, using work of Curnow and
Dunnett, and of Dash. We replace the 2nd term of the right hand side of Eq.
(25.11) by a non-diagonal expression exhibiting off-diagonal residual-residual
correlations (residuals still being uncorrelated with the factor returns however):

$$\Lambda_\alpha\,\delta_{\alpha\alpha'} \;\to\; \Lambda_\alpha\,\delta_{\alpha\alpha'} + (1 - \delta_{\alpha\alpha'})\,c_\alpha c_{\alpha'} \qquad (25.13)$$

The numbers $\{c_\alpha\}$, $\alpha = 1 \ldots N$ are used to fit the original stock return
correlation matrix data $\rho^{(xx)(\text{Data})}_{\alpha\alpha'}$ as closely as possible. Naturally since there
are $N(N-1)/2$ correlations but only $N$ parameters, an exact description is not
possible. However the reverse statement is that we can use a small number of
parameters to get a better handle on a huge intractable correlation matrix.

These results have been generalized and tested numerically, which will be
published separately.
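To make the structure concrete, here is a hedged sketch for the one-factor case. It builds the model correlation matrix with ones on the diagonal and off-diagonal elements given by the product of stock-factor correlations, optionally adding the off-diagonal residual term $c_\alpha c_{\alpha'}$ of Eq. (25.13), and measures the off-diagonal misfit to a target matrix. The function names and numbers are hypothetical, and no fitting procedure for the $c_\alpha$ is implied; the "data" here are generated from the model itself so that the correct $c_\alpha$ reproduce them exactly.

```python
import math

def model_corr(rho_xf, c=None):
    """One-factor model correlations; c adds the (1 - delta) c_a c_b term."""
    n = len(rho_xf)
    c = c if c is not None else [0.0] * n
    return [[1.0 if a == b else rho_xf[a] * rho_xf[b] + c[a] * c[b]
             for b in range(n)] for a in range(n)]

def rms_offdiag_error(model, data):
    """Root-mean-square difference over the N(N-1)/2 off-diagonal elements."""
    n = len(data)
    sq = [(model[a][b] - data[a][b]) ** 2
          for a in range(n) for b in range(a + 1, n)]
    return math.sqrt(sum(sq) / len(sq))

# Illustrative inputs: stock-factor correlations and residual parameters
rho_xf = [0.8, 0.5, 0.6, 0.3]
c_true = [0.2, -0.1, 0.3, 0.25]
data = model_corr(rho_xf, c_true)          # synthetic "data" correlation matrix

err_plain = rms_offdiag_error(model_corr(rho_xf), data)
err_with_residuals = rms_offdiag_error(model_corr(rho_xf, c_true), data)
```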


Applications for Correlated Idiosyncratic Residuals 


There are many potential uses for this formalism. For example: 
• Equity counterparty risk simulations (Ch. 31) to obtain risk where the
modeled correlations are closer to the data.
• Advanced stress value at risk (cf. Ch. 27) with many variables.
• Error bounds and risk estimates using correlated residuals improve the
risk model using standard predictive stress.
26. Plain-Vanilla VAR and Component VAR
(Tech. Index 4/10)


In this chapter, we begin a discussion of VAR, an acronym for Value at
Risk. The “Plain-Vanilla VAR” (PV-VAR) - with its incarnations as Monte Carlo
PV-VAR, Historical VAR, and a quadratic form (QPV-VAR) - is a standard risk
measure that we discuss first. PV-VAR is rather blunt and unrefined¹. In the next
chapter 27, we will discuss refinements. The first refinement stage defines the
“Improved Plain-Vanilla VAR” (IPV-VAR). Further refinements produce
“Stressed VAR” (S-VAR) and “Enhanced/Stressed VAR” (ES-VAR)².

The various components comprising the VAR, Component VARs or 
CVARs, will also be discussed. The CVARs are useful because they give a 
consistent picture of the composition of the risk. We show that a CVAR has an 
uncertainty (i.e. there is a Component VAR volatility) and we show how to 
calculate the uncertainty*. The CVAR volatility is useful because it shows the 
uncertainty in different possible compositions for a given total VAR risk. 


Plain-Vanilla Monte-Carlo VAR (PV-VAR) 


We first describe “Plain-Vanilla VAR" (PV-VAR). Basically, PV-VAR is a one- 
step simulator in time that measures risks at a given confidence level of a 
portfolio” C using a variety of simplifying assumptions. The portfolio C with 


' Dignity for Plain-Vanilla VAR: We by no means imply to denigrate the enormous 
and justifiable effort that is needed to implement this risk measure on a firm-wide basis. 
You won’t understand why unless you have been in the middle of it all. It’s not fun. 


? Notation: The names “Plain Vanilla VAR”, “Improved Plain Vanilla VAR”, “Stressed 
VAR”, and “Enhanced/Stressed VAR” are intended to convey a sense of relative 
sophistication. “Stressed VAR” has two definitions (and I believe mine is the original 
definition); see Ch. 27. 


* С means COMPONENT in CVAR in this book: Please note!! Other people mean 
something else when referring to CVAR. 


^ History: The theory of Component VAR volatility described here (and in Ch. 27-29) 
was discovered and developed by me in 1999-2002. 


5 Which Portfolio? The portfolio can be at the level of a product, desk, business-unit, or 
central firm-wide Risk Management. 
value $\$C$ is a function of underlying variables $\{x_\alpha\}$, $\alpha = 1 \ldots n$. The variable $x_\alpha$
can be a physical variable (e.g. interest rate, stock price, etc.), or $x_\alpha$ can be a
function, commonly $x_\alpha = \ln y_\alpha$, where $y_\alpha$ is the physical variable⁶. The time
difference is written $d_t x_\alpha(t) = x_\alpha(t + dt) - x_\alpha(t)$, where $dt$ is a convenient
time unit (not infinitesimal)⁷. If $x_\alpha = \ln y_\alpha$, the time difference $d_t x_\alpha$ defines the
return $dy_\alpha / y_\alpha$ over time $dt$. The probability distribution function of the time
differences is modeled; the usual assumption is multivariate Gaussian. The
volatilities $\{\sigma_\alpha\}$ and correlations $\{\rho_{\alpha\beta}\}$ occurring therein are obtained from
(hopefully clean) data⁸.


A risk “exposure”⁹ denoted as $\$E_\alpha$ describes a change in value¹⁰ of the
portfolio $\$C$ for a given change in the underlying $x_\alpha$. A few examples of
exposures include bond DV01 and spread risks, the Greeks for options, mortgage
prepayment risk...¹¹. For example the exposure delta of a swap, multiplied by a
forward-rates change, moves the swap value. See Chapter 8 (Eq. 8.1) for details.
The risk exposures must be obtained from models or other sources¹². The first


6 Principal Component VAR: Another possibility is to take the variables $x_\alpha$, or some of
them, as principal components (PC). One example is to take the leading PCs for the yield
curve, or to add a few PCs orthogonal to the parallel yield-curve shift that is already
present as DV01. This PC-VAR idea is in my notes from 1999. Unfortunately, there is
not enough space in the book to include it, but the reader can fill in the blanks.


7 Risk for Time Interval dt: Popular choices are 1 day, 10 (business) days, or 1 quarter.
For the Gaussian assumption of the $d_t x$ distribution, the sqrt(dt) scaling translates between
these choices. Sometimes actual differences are used, e.g. dt = 10 days, without scaling.


* Getting the Underlying Data: Obtaining appropriate and clean data can be a huge 
issue. One of the two irreverent but big theorems in this book is “Data = Black Hole". 


? Disclaimer: No information regarding numerical values of position exposures of any 
firm is in this book. 


10 USD: By convention, the US dollar is used here as the reporting currency. The
VAR calculated in USD would include FX risk for assets held in non-USD currencies.


1 Idiosyncratic Risks: Some risks that are called idiosyncratic in this book cannot or are 
not obtained using models, exposures, and underlying data. These risks can be large. 
Idiosyncratic risks must be estimated using judgment if they are to be included. They are 
by definition left out of the plain-vanilla VAR. Examples are discussed in Ch. 27
along with a method of including idiosyncratic risks in a more refined version of VAR.


? Getting the Exposures: Getting the exposures for a large institution can be a huge 
issue. If models from the trading desks are used, feeds for the exposures must be provided 
approximation is the “linear” assumption, $\$E_\alpha = \partial \$C / \partial x_\alpha$, which ignores the
second derivative, or convexity¹³.

Because the portfolio C changes all the time as securities are bought and
sold, a time $t_{\text{frozen}}$ is chosen when a “snapshot” of the portfolio is taken and the
calculations are performed on this “frozen” portfolio¹⁴,¹⁵.

The VAR calculation is performed using a Monte-Carlo (MC) simulation, or
in the simplest case, a quadratic approximation. The calculation is done to a
given confidence level (CL) of loss. We call $k_{CL}$ the number of standard
deviations (sd) corresponding to the given CL. Different assumptions are made
corresponding to the application. For standard reporting, a 99% CL is often used
(so $k_{CL} = 2.33$). For Economic Capital, which we discuss in Ch. 39, a more
stringent CL is used (e.g. 99.97% for an AA firm, with $k_{CL} = 3.43$).
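The quoted numbers of standard deviations follow from the one-sided Gaussian quantile. A short check, using Python's standard-library `statistics.NormalDist` (the function name `k_cl` is just an illustrative label):

```python
from statistics import NormalDist

def k_cl(confidence_level):
    """Number of standard deviations for a one-sided Gaussian confidence level."""
    return NormalDist().inv_cdf(confidence_level)

k_99 = k_cl(0.99)        # standard reporting
k_9997 = k_cl(0.9997)    # Economic Capital example for an AA firm
```

Rounded to two decimals these give 2.33 and 3.43. Fat-tailed distributions would require more standard deviations than the Gaussian values at the same CL.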


Calculations of VAR¹⁶ can be done frequently or infrequently, depending on
the need and other considerations¹⁷.


to firm-wide Risk Management. If Risk Management desires independent models, these 
models must first be constructed, which is an even bigger issue. 


13 Convexity: The contribution of convexity depends on the situation. Bonds do not have
much convexity if they do not contain strong call features. Therefore, for a bond desk,
ignoring convexity may not matter much. The same statement holds for plain-vanilla
swaps. On the other hand, ignoring convexity is a terrible approximation for an options
desk. We discuss ways of including convexity in the next chapter.


" What Time for the Portfolio? Usually the frozen time is specified by management; 
e.g. end of quarter, daily, etc. If the portfolio at the frozen time is not representative, the 
risk calculated will not be representative either. We can also envision portfolio changes 
through exposure scenarios in the future. We describe this idea in the next chapter. 


15 No Life-Cycle Events in VAR: Life-cycle events include option exercise, cash flow
payments, etc. These are not included in VAR; the portfolio really is “frozen” in time,
even though the changes in the underlying variables are taken over finite time dt. This
means the calculations are simpler than for real-life P&L attribution over time dt, where
life cycle events are taken into account.


^ Acknowledgement: I thank D. Humphreys for many discussions and calculations of 
VAR during the period that my Quantitative Analysis group was responsible for 
producing the VAR at Smith Barney. 


" Frequency and Timeliness of VAR Calculations: Variables include the availability 
of data and exposures, the calculational engine efficiency, the amount of budget and 
manpower dedicated, etc. The timeliness depends for what purpose the calculation is 
used. If the VAR is used as an indicative risk benchmark for reporting purposes, 
timeliness is not as important as if it is used as a qualitative benchmark for active risk
management. For active risk management, stale positions and data are about as useful as 
an old newspaper. 
VAR, as commonly defined, is supposed to reflect risk measured as the
difference of the total loss and the usual or expected loss. VAR thus is supposed
to represent an unusual, rather rapid, fluctuation in the markets leading to a bad
loss. The expected risk is assumed to be constant or slowly varying in time,
although in reality the time dependence of the expected risk may present a
substantial risk itself¹⁸.

We examine this sort of “trend risk” in detail later in the book when we
discuss the “Macro-Micro” model.


Historical VAR (HVAR) and Monte Carlo HVAR 


Historical VAR or HVAR is often used, along with a variant called Monte-Carlo
Historical VAR or MC HVAR¹⁹.


HVAR or Historical VAR 


HVAR or Historical VAR is an historical “simulation” that chooses the potential
changes in the underlying variables from a historical set of such changes over 
time. These historical changes are applied to the current values of the underlying 
variables, all with a frozen portfolio. HVAR has the advantage that no 
assumptions about distributions have to be made. The HVAR at a definite 
confidence level corresponds to a definite date in the past, so drill-down to see 
the details of the risks is well defined and easy to see. HVAR has the 
disadvantage that the number of states and their composition are limited to the 
available historical data. 


HVAR also requires that proxies be obtained for many situations where the 
securities do not have direct data. Two procedures are employed: (1) securities 


'S Expected Losses, Pricing, Reserves and Economics: Expected-loss risk is not 
included in VAR. Expected losses are supposed to be included in pricing, expected-loss 
reserves, etc. If appropriate reserves do not exist or do not reasonably reflect expected 
losses, a mistake in the firm's risk assessment will be made. 

One problem is that changing economic conditions can and probably will change the 
expected losses. In addition, the average loss in the MC simulation used to get the VAR 
may or may not reasonably reflect the expected loss in reality. Therefore, a real risk is 
involved in the seemingly harmless academic assumption that unexpected risk is 
deviation from the mean, because the mean can also change in unexpected ways for 
which provisions were not taken. 

This is just another example of failure of simple assumptions for practical risk
management. 


? Acknowledgement: I thank Eugene Stern for pointing out the equivalence of MC 
HVAR and ordinary MC VAR to me. 
can be built up through curve modeling, and/or (2) direct substitutes can be
employed, e.g. for bonds.


Here is the HVAR formalism. Historical simulation over the VAR period
(e.g. 1 year or 250 days before $t_0$) on the frozen portfolio C at the time of
analysis $t_0$ is used. Underlying variable changes $\{d_t x_\alpha(t_l)\}$ are added to the
current underlyings $\{x_\alpha(t_0)\}$ to get the simulation scenarios. The portfolio is
recalculated for each scenario. The changes $\{d\$C(t_l)\}$ from the initial portfolio
value are rank ordered. HVAR is $d\$C_{CL}$ at the desired confidence level CL²⁰.
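The HVAR recipe above can be sketched in a few lines. This is only an illustration: the portfolio valuation function, the one-variable history, and the rank convention (the k-th worst change with k the integer part of the number of scenarios times 1 − CL, which gives the 2nd worst loss for 250 days at 99%) are assumptions made for the example.

```python
def historical_var(value_fn, x0, hist_changes, cl=0.99):
    """Rank-ordered historical simulation on a frozen portfolio.

    value_fn     : maps a list of underlying values to the portfolio value
    x0           : current underlyings x_alpha(t_0)
    hist_changes : one list of changes d_t x_alpha(t_l) per historical date
    Returns the portfolio change at the chosen rank (negative means a loss).
    """
    base = value_fn(x0)
    pnl = sorted(value_fn([x + dx for x, dx in zip(x0, d)]) - base
                 for d in hist_changes)
    k = max(1, int(len(pnl) * (1.0 - cl)))   # e.g. 250 dates at 99% -> k = 2
    return pnl[k - 1]

# Illustrative frozen portfolio: linear exposure of 2.0 to a single underlying
x0 = [100.0]
value_fn = lambda x: 2.0 * x[0]
hist_changes = [[i / 100.0] for i in range(-125, 125)]   # 250 made-up daily changes
hvar = historical_var(value_fn, x0, hist_changes, cl=0.99)
```

Because every scenario is tied to a historical date, the date behind the selected rank can be kept alongside the sorted changes for drill-down.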


Monte Carlo HVAR 


MC HVAR uses as inputs random Gaussian combinations of historical returns 
(instead of just the historical returns, as for standard HVAR). A given set of 
Gaussian random numbers is used for the linear combination of ALL returns to 
define a given random state. So for a given state, e.g., we take a random linear 
combination of historical SPX returns and the same random linear combination 
of historical Libor returns. Using the fact that different infinite sequences of 
Gaussian random variables are orthogonal, correlations are preserved by this 
approach, so MC HVAR is really just ordinary MC VAR in disguise. 


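To demonstrate, denote Gaussian random numbers $\{R^{(\aleph)}_l\}$ for historical
times $\{t_l\}$, $l = 1 \ldots N$ and MC HVAR states $\aleph = 1 \ldots N_{MC}$. The MC HVAR return
$d_t x^{(MC\_HVAR)}_\alpha(\aleph)$ in state $\aleph$ for the variable labeled $\alpha$ is by definition the Gaussian
random-number-weighted linear combination of data returns $\{d_t x^{(\text{Data})}_\alpha(t_l)\}$:

$$d_t x^{(MC\_HVAR)}_\alpha(\aleph) = \frac{1}{\sqrt{N}}\sum_{l=1}^{N} R^{(\aleph)}_l\, d_t x^{(\text{Data})}_\alpha(t_l) \qquad (26.1)$$

The MC HVAR correlation $\rho^{(MC\_HVAR)}_{\alpha\beta}$ is defined as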

20 HVAR date for the given Confidence Level: For example if the CL = 99% for one
year HVAR, the 2nd worst loss can be used. This picks out a definite transition, e.g.
between the 43rd and 42nd days before the analysis time $t_0$.
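$$\rho^{(MC\_HVAR)}_{\alpha\beta} = \frac{1}{N_{MC}}\sum_{\aleph=1}^{N_{MC}} d_t x^{(MC\_HVAR)}_\alpha(\aleph)\; d_t x^{(MC\_HVAR)}_\beta(\aleph) \qquad (26.2)$$

Using the relation $\frac{1}{N_{MC}}\sum_{\aleph=1}^{N_{MC}} R^{(\aleph)}_l R^{(\aleph)}_{l'} \approx \delta_{ll'}$ valid for large $N_{MC}$, we get²¹

$$\rho^{(MC\_HVAR)}_{\alpha\beta} \approx \frac{1}{N}\sum_{l=1}^{N} d_t x^{(\text{Data})}_\alpha(t_l)\; d_t x^{(\text{Data})}_\beta(t_l) = \rho^{(\text{Data})}_{\alpha\beta} \qquad (26.3)$$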

Eq. (26.3) says the MC HVAR correlations are the ordinary data correlations, 
up to finite statistics. Hence an advantage is that the correlation matrix does not 
have to be explicitly calculated, which is useful when one has tens of thousands 
of variables. 
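A quick numerical experiment shows this preservation of correlations. The sketch below is an illustration under assumptions: the two "historical" series are synthetic, the weighted sums are normalized by the square root of the number of historical dates so that variances carry over, and the names are hypothetical. The key feature is that the same set of Gaussian weights is applied to every variable within a state.

```python
import math
import random

def pearson(xs, ys):
    """Sample correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def mc_hvar_returns(data, n_states, seed=0):
    """data[alpha][l] are historical returns; result[alpha][state] are MC HVAR returns."""
    rng = random.Random(seed)
    n_hist = len(data[0])
    out = [[] for _ in data]
    for _ in range(n_states):
        weights = [rng.gauss(0.0, 1.0) for _ in range(n_hist)]  # shared by ALL variables
        for a, series in enumerate(data):
            out[a].append(sum(w * x for w, x in zip(weights, series)) / math.sqrt(n_hist))
    return out

# Synthetic "historical" returns for two variables, means subtracted
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y = [0.6 * xi + 0.8 * rng.gauss(0.0, 1.0) for xi in x]
x = [v - sum(x) / len(x) for v in x]
y = [v - sum(y) / len(y) for v in y]

data_corr = pearson(x, y)
mc = mc_hvar_returns([x, y], n_states=2000, seed=1)
mc_corr = pearson(mc[0], mc[1])
```

The two correlations agree up to the finite-statistics error of the state sample, without a correlation matrix ever being formed.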


One disadvantage is that it is not easy to modify the result for Stressed VAR 
including stressed correlations, as discussed in Ch. 27. 


Hybrid HVAR + MCVAR 


A hybrid HVAR + MCVAR can be constructed. The advantage of this approach
is that fat tails (not present in ordinary Gaussian MCVAR) can be incorporated.
There is a mixing angle parameter $\psi$ specifying the hybrid approach.

Consider the data return $d_t x^{(\text{Data})}_\alpha(t_\xi)$ as for HVAR, but at a randomly
selected time $t_\xi$ with $\xi = 1 \ldots N$. Get the MCVAR return $d_t x^{(MCVAR)}_\alpha$ from
Eq. (26.1). Then the hybrid return $d_t x^{(\text{Hybrid})}_\alpha$ to be used in the Hybrid
HVAR+MCVAR is defined as a linear combination of these two returns, viz

$$d_t x^{(\text{Hybrid})}_\alpha = \cos\psi \cdot d_t x^{(\text{Data})}_\alpha(t_\xi) + \sin\psi \cdot d_t x^{(MCVAR)}_\alpha \qquad (26.4)$$
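One hybrid state can be sketched as follows. This is an assumption-laden illustration: it presumes the cosine weights the historical return and the sine weights the MC return, that one randomly chosen historical date is shared by all variables in the state, and that the MC returns for the state have already been generated. The function name and numbers are hypothetical.

```python
import math
import random

def hybrid_state(data, mc_state, psi, rng):
    """Mix one random historical date with one MC state using mixing angle psi.

    data[alpha][l] : historical returns per variable and date
    mc_state[alpha]: MC return for each variable in this state
    """
    n_hist = len(data[0])
    xi = rng.randrange(n_hist)   # one random historical date, shared by all variables
    return [math.cos(psi) * series[xi] + math.sin(psi) * mc
            for series, mc in zip(data, mc_state)]

# Illustrative (made-up) inputs
data = [[0.01, -0.02, 0.03], [0.00, 0.01, -0.01]]
mc_state = [0.5, -0.5]
pure_history = hybrid_state(data, mc_state, 0.0, random.Random(0))
pure_mc = hybrid_state(data, mc_state, math.pi / 2.0, random.Random(0))
mixed = hybrid_state(data, mc_state, math.pi / 4.0, random.Random(0))
```

Since the squared cosine and sine sum to one, the mix keeps the overall variance roughly unchanged when the two ingredients have equal variance and are independent.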


?' Homework: Generate two columns of Gaussian random numbers in Excel, multiply 
the columns together, add up the results, and plot the sum as a function of the length A of 
the column. What do you see? You remember how to get Gaussian random numbers, 
right? See Ch. 21.
Scenarios


Scenarios of various types are used to obtain risk measures in addition to VAR.
Sometimes scenario analysis is used for a variable $x_\alpha$. This means that one
value (or a small set of values) is assumed for $d_t x_\alpha$. The loss of the portfolio is
then calculated using the scenario. This is a degenerate form of MC simulation
where the randomness is replaced by the certainty of the scenario. The scenario
for $d_t x_\alpha$ can be taken from data as the worst-case move over some time window,
or the move at some high confidence level CL, or the moves during some past
crisis. What-if scenarios are also used (as in “What if $x_\alpha$ were to decrease by
20%”).

Scenario-based risk cannot generally be consistently evaluated at a given
overall portfolio risk CL. However, scenarios are necessary if data do not exist
for $d_t x_\alpha$, as we discuss later for idiosyncratic risk.

The Stressed VAR using fat-tail vols and stressed correlations is however an
exception. Here, scenarios are used for the definition of the tails and the
definitions of stresses in the correlations, and the calculation is then performed in
a mathematically consistent way using the ordinary VAR mathematical
framework. See Ch. 27.

Sometimes for simplicity scenarios are adopted for certain "core factor" 
variables, and scenarios for other variables are obtained via correlations to the 
core factors; these are called “Predictive Stress Scenarios". Idiosyncratic risk can 
be added to get better consistency with the correlation matrix data (cf. Ch. 25). 

There are also recently-established regulator-based scenarios that must be 
carried out for large financial institutions. See Ch. 31. 


Scenario Uncertainties 


Scenarios could be improved using uncertainties: “What if $x_\alpha$ were to decrease
by 20% ± 3%?” Historical or MC analysis could be used to evaluate the
probability $P_\alpha$ that $x_\alpha$ changes by a given amount $\Delta x_\alpha$ plus/minus a specified
uncertainty с,. For example if there are Ne= 10^ MC trials and if the 


probability is P, =10% for a given change within a given uncertainty, then we 


would expect one state N, , with portfolio value change ôv (N) If this 


procedure were systematically carried out, portfolio changes could be ranked, and 
tagged with the states, to get the risk at a given confidence level in a scenario- 
based context. In that way, a scenario with uncertainty could be associated with a 
portfolio risk confidence level, within a risk confidence level uncertainty. 
Quadratic VAR and Component VARs (CVARs)


This simplest of all versions of VAR uses a quadratic form $\$VAR^{Quad}$ made up
of the exposures, volatilities and correlations to get risk at an overall confidence
level (CL) corresponding to $k_{CL}$ standard deviations (SD) for time interval $dt$.
This ignores convexity and uses the other assumptions mentioned above. For a
quick review, we write $d_t x_\alpha = \sigma_\alpha dz_\alpha$ with volatility²²,²³ $\sigma_\alpha$. This ignores the
drift term that does not (by definition) contribute to the VAR²⁴. Here the
correlated Gaussian measures satisfy²⁵ $\langle dz_\alpha\, dz_\beta\rangle = \rho_{\alpha\beta}\,dt$. The expectation of
the product of the time changes is $\langle d_t x_\alpha\, d_t x_\beta\rangle = \sigma_\alpha\sigma_\beta\rho_{\alpha\beta}\,dt$.

We set the total portfolio value change $d\$C$ to its linear approximation:

$$d\$C \approx \sum_{\alpha=1}^{n} \frac{\partial \$C}{\partial x_\alpha}\, d_t x_\alpha\,, \qquad \$E_\alpha = \frac{\partial \$C}{\partial x_\alpha} \qquad (26.5)$$

$$\left\langle \left(d\$C\right)^2 \right\rangle = \left(\$\sigma^{Quad}_{VAR}\right)^2 dt \qquad (26.6)$$

Eq. (26.6) gives the portfolio change variance. $\$\sigma^{Quad}_{VAR}$ is the “quadratic VAR”
total portfolio volatility, given by the quadratic form²⁶

2 Volatility Specification for VAR: This volatility can be taken as the ordinary 
volatility or as a fat-tail vol, as described in Ch. 21, and as elaborated in the next chapter. 


? Skew Risk: Skew for European option vols can be taken into account in an 
approximate way (see Ch. 6). This is possible for FX and equity options where a “vol 
surface" exists (option exercise time and strike). It is much more difficult for interest rate 
options where a “vol cube" exists (option exercise time, time-to-maturity, strike). 


24 Subtracting the Mean - Is this a good idea? Subtracting the mean so $\langle d_t x_\alpha\rangle = 0$
implies there is no risk in the drift/trend. Even though sanctified by tradition, this is 
wrong in the real world. Trend risk exists and is discussed later in the book in the section 
on the Macro-Micro model. 


? Expectation Values of Products of Gaussian Measures: The textbook result holds 
only in the limit as we generate an infinite number of Gaussian random numbers. Finite 
Monte-Carlo simulations produce fluctuations about this theoretical limit. 


°° Homework: It is not hard to get the result for quadratic-form VAR. How about if you 


try it? 
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s 1/2 
удро“ -| $80 y 6,0 | (26.7) 


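Note that the correlation sensitivity of the quadratic VAR is

$$ \frac{\partial \left( \$\mathrm{VAR}^{\mathrm{Quad}}_{\mathrm{Vol}} \right)}{\partial \rho_{\alpha\beta}} = \frac{\$\mathcal{E}_\alpha \sigma_\alpha\, \$\mathcal{E}_\beta \sigma_\beta}{\$\mathrm{VAR}^{\mathrm{Quad}}_{\mathrm{Vol}}} \qquad (26.8) $$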


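The QPV-VAR result, defined as $k_{CL}$ standard deviations of this total
portfolio quadratic-VAR volatility over time $dt$, is

$$ \$\mathrm{VAR}^{\mathrm{Quad}}_{k_{CL}} = k_{CL} \sqrt{dt}\; \$\mathrm{VAR}^{\mathrm{Quad}}_{\mathrm{Vol}} \qquad (26.9) $$

Squaring the $\$\mathrm{VAR}^{\mathrm{Quad}}_{\mathrm{Vol}}$ equation and isolating the sum $\sum_{\alpha=1}^{n}$ produces

$$ \$\mathrm{VAR}^{\mathrm{Quad}}_{\mathrm{Vol}} = \sum_{\alpha=1}^{n} \$\mathrm{CVAR}^{\mathrm{Quad}}_\alpha \qquad (26.10) $$

Here $\$\mathrm{CVAR}^{\mathrm{Quad}}_\alpha$, the quadratic-plain-vanilla “component VAR”, QPV-CVAR,
is defined as

$$ \$\mathrm{CVAR}^{\mathrm{Quad}}_\alpha = \frac{\$\mathcal{E}_\alpha \sigma_\alpha \sum_{\beta=1}^{n} \rho_{\alpha\beta}\, \$\mathcal{E}_\beta \sigma_\beta}{\$\mathrm{VAR}^{\mathrm{Quad}}_{\mathrm{Vol}}} \qquad (26.11) $$

Unlike the positive $\$\mathrm{VAR}^{\mathrm{Quad}}_{k_{CL}}$, a CVAR can be either positive or negative,
depending on the signs of the exposures and correlations. This point will come up
again when we discuss the allocation of risk.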
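
As a quick numerical check of eqs. (26.7), (26.9), (26.10) and (26.11), here is a minimal Python sketch. The exposures, vols and correlations are made-up numbers, and the function name `quadratic_var` is ours for illustration only. The second exposure is chosen so that its CVAR comes out negative.

```python
import numpy as np

def quadratic_var(exposures, vols, corr, k_cl=2.33, dt=1.0):
    """Return (VAR at k_cl SD, VAR volatility, component VARs at the vol level)."""
    w = np.asarray(exposures, float) * np.asarray(vols, float)  # E_a * sigma_a
    corr = np.asarray(corr, float)
    var_vol = float(np.sqrt(w @ corr @ w))       # quadratic form, eq. (26.7)
    cvar = w * (corr @ w) / var_vol              # component VARs, eq. (26.11)
    var_k = k_cl * np.sqrt(dt) * var_vol         # eq. (26.9)
    return var_k, var_vol, cvar

# Hypothetical three-variable portfolio (illustrative numbers only)
exposures = [100.0, -50.0, 30.0]
vols = [0.02, 0.03, 0.05]
corr = [[1.0, 0.8, 0.1],
        [0.8, 1.0, 0.0],
        [0.1, 0.0, 1.0]]

var_k, var_vol, cvar = quadratic_var(exposures, vols, corr)
print("VAR vol:", var_vol)
print("CVARs:", cvar, "sum:", cvar.sum())   # the sum reproduces the VAR vol, eq. (26.10)
```

The component VARs add up to the total VAR volatility by construction, while an individual component (here the short position) can carry the opposite sign.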
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Monte-Carlo VAR 


We can relate the above QPV-VAR to a Monte Carlo simulation at the overall
portfolio risk CL with $k_{CL}$ standard deviations²⁷. We use independent Gaussian
measures by taking the square root of the correlation matrix²⁸ as

$$ dz_\alpha = \sum_{\gamma=1}^{n} \left( \rho^{1/2} \right)_{\alpha\gamma} dw_\gamma \qquad (26.12) $$

Here $dw_\gamma$ are independent Gaussian random variables satisfying

$$ \langle dw_\gamma\, dw_{\gamma'} \rangle = \delta_{\gamma\gamma'}\, dt \qquad (26.13) $$


The States and Portfolio Changes of the MC Simulation


From a MC simulation, we obtain $N_{MC}$ different states $\{\aleph\}$ for the movements
of the underlying variables that are generated by different values of the random
numbers. For example, a given state $\aleph$ might have a move up of +12 bp for 5-year
Libor swaps, a move down of -24 points for the S&P 500, etc.

We look at the histogram of portfolio changes $\{ d\$C^{(\aleph)} \}$ for the different
states, viz

$$ d\$C^{(\aleph)} = \sum_{\alpha=1}^{n} \$\mathcal{E}_\alpha \left( d_t x_\alpha \right)^{(\aleph)} = \sum_{\alpha,\gamma=1}^{n} \$\mathcal{E}_\alpha \sigma_\alpha \left( \rho^{1/2} \right)_{\alpha\gamma} dw_\gamma^{(\aleph)} \qquad (26.14) $$


27 Confidence Levels for the Total Portfolio VAR and For Individual Variables: Note
that $k_{CL}$ is the number of SD for the overall CL of the total portfolio risk. This overall CL
can be attained if each volatility $\sigma_\alpha$ is replaced by $k_{CL}\sigma_\alpha$. However, in a Monte Carlo
calculation, the number of SD $k_{CL}^{(\alpha)}$ for each variable $d_t x_\alpha$ is different. Moreover, this
$k_{CL}^{(\alpha)}$ varies with the MC run, as explained in the text below.


28 Correlation Matrix Square Root, the Cholesky Decomposition, the SVD, and
Block Diagonalization: The story of taking the square root of the correlation matrix is
long and grimy. Because the correlation matrix taken from the data contains data noise, it
is rarely (read never) positive definite. Therefore, a real square-root matrix does not exist
and the Cholesky decomposition breaks down. Instead, a procedure using the Singular
Value Decomposition and optimal least-squares fitting can be used, as described in Ch.
24. For large matrices involving many variables (e.g. thousands), optimization can only
be performed crudely. Further, because of machine memory and other limitations,
assuming a block-diagonal form may be required.


Next, we pick out the value of $d\$C_{CL}$ at the specified CL corresponding to
the particular state $\aleph(k_{CL})$ producing the total loss at that CL. For example, for
the 99% CL and $N_{MC}$ = 10,000 states, we pick out the 100th worst loss. This
reproduces the above QPV-VAR result under the linear assumption. Up to
standard MC noise of $O\!\left( N_{MC}^{-1/2} \right)$, we have

$$ d\$C_{CL} = k_{CL} \sqrt{dt}\; \$\mathrm{VAR}^{\mathrm{Quad}}_{\mathrm{Vol}} \qquad (26.15) $$

Hence, in the limit of an infinite MC simulation, in the linear approximation,
we recover the QPV-VAR result, $\lim_{N_{MC} \to \infty} d\$C_{CL} = \$\mathrm{VAR}^{\mathrm{Quad}}_{k_{CL}}$. While it may
seem that this is like using a hammer to swat a fly, MC simulation is the only
method available when we start refining the VAR assumptions.
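
The agreement between the MC tail loss and the quadratic formula can be checked numerically. The sketch below is only an illustration under stated assumptions: the weights `w` (exposure times vol) and the correlation matrix are made-up, the matrix is chosen positive definite so that the Cholesky square root exists (which, as the footnote warns, is not guaranteed for real data), and we use 2.326 rather than the rounded 2.33 for the 99% CL.

```python
import numpy as np

rng = np.random.default_rng(0)

w = np.array([2.0, -1.5, 1.5])            # E_a * sigma_a, hypothetical
corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.0],
                 [0.1, 0.0, 1.0]])        # positive definite by construction
dt, k_cl, n_mc = 1.0, 2.326, 200_000

root = np.linalg.cholesky(corr)           # one valid square root of the correlation matrix
dw = rng.standard_normal((n_mc, 3)) * np.sqrt(dt)   # independent measures, variance dt
dz = dw @ root.T                          # correlated measures, as in eq. (26.12)
d_c = dz @ w                              # linear portfolio change per state, eq. (26.14)

n_tail = int(round(0.01 * n_mc))          # 1% of the states
mc_var = -np.sort(d_c)[n_tail - 1]        # the n_tail-th worst loss, quoted as a positive number
quad_var = k_cl * np.sqrt(dt) * np.sqrt(w @ corr @ w)   # eq. (26.9)

print("MC VAR:", mc_var, " quadratic VAR:", quad_var)
```

With this many states the two numbers should agree to within a percent or so; the residual difference is the MC noise mentioned in the text.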


Backtesting VAR 


A backtesting procedure is often used to provide a test of the daily VAR
calculations, i.e. $dt$ = 1 day. We will illustrate the procedure on HVAR²⁹. The
frozen portfolio at the time $t_\ell$ of the HVAR analysis is again used. The portfolio
is re-evaluated by changing the $\{ x_\alpha(t_\ell) \}$ underlyings by $\{ d_t x_\alpha(t_\ell) \}$ to the
values $\{ x_\alpha(t_\ell + 1) \}$ observed the next day, i.e. $d_t x_\alpha(t_\ell) = x_\alpha(t_\ell + 1) - x_\alpha(t_\ell)$ is
added to $x_\alpha(t_\ell)$. The resulting change in the portfolio $d\$C(t_\ell)$ is compared to
the HVAR amount $d\$C_{CL}^{HVAR}(t_\ell)$ also calculated at $t_\ell$. An exception is recorded if
$\left| d\$C(t_\ell) \right| > \left| d\$C_{CL}^{HVAR}(t_\ell) \right|$. This comparison is made for each $t_\ell$ over a calendar
year of 250 days.

If the CL is 99%, then the backtesting “works” and is consistent provided 
that there are no more than two or three "exceptions". If there are too many 
exceptions, the VAR is not stringent enough. 
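
The “two or three” figure is just 1% of 250 days, i.e. 2.5 exceptions on average. A minimal sketch of the counting follows. The P&L series here is simulated Gaussian noise rather than real data, the function name `count_exceptions` is ours, and we assume that only losses (not gains) larger than the VAR count as exceptions.

```python
import numpy as np

def count_exceptions(pnl, var):
    """Count days whose realized loss exceeds that day's VAR (VAR quoted as a positive number)."""
    pnl = np.asarray(pnl, float)
    var = np.asarray(var, float)
    return int(np.sum(-pnl > var))

rng = np.random.default_rng(1)
n_days, sigma = 250, 1.0
pnl = rng.normal(0.0, sigma, n_days)        # hypothetical daily portfolio changes
var_99 = np.full(n_days, 2.326 * sigma)     # one-day 99% Gaussian VAR for each day

n_exc = count_exceptions(pnl, var_99)
expected = 0.01 * n_days                    # average number of exceptions at the 99% CL
print("exceptions:", n_exc, " expected on average:", expected)
```

Any single year will scatter around the average, so a count of four or five is a warning sign rather than a proof that the VAR is wrong.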


29 Acknowledgements: I thank Nora Omarova for a discussion on this and many other
practical risk topics.


Component VAR (CVAR) and CVAR Volatility from MC


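We now discuss the CVARs generated by MC simulation. The MC state
$\aleph(k_{CL})$ that corresponds to the overall CL of $k_{CL}$ standard deviations in the
total VAR contains values of all the variables $\{ (d_t x_\alpha)^{\aleph(k_{CL})} \}$. In the state
$\aleph(k_{CL})$, we have by definition the result for the change in the portfolio at this
CL, given the exposures $\{ \$\mathcal{E}_\alpha \}$ and the changes $\{ (d_t x_\alpha)^{\aleph(k_{CL})} \}$, viz

$$ d\$C_{CL} = \sum_{\alpha=1}^{n} \$\mathcal{E}_\alpha \left( d_t x_\alpha \right)^{\aleph(k_{CL})} \qquad (26.16) $$

Now the VAR is, by construction, the sum of the CVARs. Hence, by
identifying terms, the CVARs in the state $\aleph(k_{CL})$ are just the terms in the sum
eq. (26.16) for $d\$C_{CL}$. That is,

$$ \left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph(k_{CL})} = \$\mathcal{E}_\alpha \left( d_t x_\alpha \right)^{\aleph(k_{CL})} \qquad (26.17) $$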


Non-Uniqueness of the CVARs 
However, a given CVAR generated by MC simulation is not unique, even with an
infinite number of states $\{\aleph\}$. This is because a given fixed value of a sum can
have different amounts of the components making up the sum no matter how
accurately the sum is generated³⁰.

To see why the CVARs are not unique, we define Monte-Carlo runs labeled
with an index $\Re$. To emphasize that the CVAR uncertainty has nothing to do
with MC noise, we specify that each run $\Re$ consists of an infinite number
of states $\{\aleph\}$. Exactly the same value $d\$C_{CL}$ at the given $k_{CL}$ can be
generated, but with different states $\aleph^{(\Re)}(k_{CL})$, for different runs. A given variable
$d_t x_\alpha$ will have different values $(d_t x_\alpha)^{\aleph^{(\Re)}(k_{CL})}$ in different runs $\Re$, all for a given
total $d\$C_{CL}$. A given term, i.e. a given CVAR, in the sum
$\sum_{\alpha=1}^{n} \$\mathcal{E}_\alpha (d_t x_\alpha)^{\aleph^{(\Re)}(k_{CL})}$ is different for different runs.


30 Grocery Bag Components Analogy for CVAR Uncertainty: A useful and visual
analogy for the CVAR variability is that a given and exact total amount of money like
$80.00 (VAR) can a priori be spent buying different amounts (variability) of different
things (CVARs). A drawback of this analogy is that negative CVARs are hard to picture.


Summary for MC Simulation 
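Summarizing, take a MC run $\Re$ and the state $\aleph^{(\Re)}(k_{CL})$ at which $d\$C_{CL}$ has
the specified CL corresponding to $k_{CL}$ SD. Then the CVAR for variable $d_t x_\alpha$ as
generated in that particular MC run, called $\left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})}$, is given by
$\left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})} = \$\mathcal{E}_\alpha \left( d_t x_\alpha \right)^{\aleph^{(\Re)}(k_{CL})}$. For the MC run $\Re$, the change in
portfolio value $d\$C_{CL}$ at the specified CL is given by

$$ d\$C_{CL} = \sum_{\alpha=1}^{n} \$\mathcal{E}_\alpha \left( d_t x_\alpha \right)^{\aleph^{(\Re)}(k_{CL})} = \sum_{\alpha=1}^{n} \left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})} \qquad (26.18) $$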


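Connection of the Monte-Carlo and Quadratic CVARs


The connection of the Monte-Carlo CVAR, $\left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})}$, with the quadratic
CVAR defined above is accomplished by averaging over runs and inserting the
factor $k_{CL}\sqrt{dt}$, viz

$$ k_{CL} \sqrt{dt}\; \$\mathrm{CVAR}^{\mathrm{Quad}}_\alpha = \mathrm{Ave}_{\Re} \left[ \left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})} \right] \qquad (26.19) $$


The CVAR (Component VAR) Volatility
Because different runs produce different amounts of a given component risk, we
want to define the volatility for a given $\$\mathrm{CVAR}_\alpha$. The CVAR-volatility
$\sigma\!\left( \$\mathrm{CVAR}_\alpha \right)$ can be defined by the standard deviation of $\left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})}$
over different MC runs, viz

$$ \sigma^2\!\left( \$\mathrm{CVAR}_\alpha \right) = \mathrm{Ave}_{\Re} \left[ \left( \left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})} \right)^2 \right] - \left( \mathrm{Ave}_{\Re} \left[ \left( \$\mathrm{CVAR}^{MC}_\alpha \right)^{\aleph^{(\Re)}(k_{CL})} \right] \right)^2 \qquad (26.20) $$

We stress again that the CVAR volatility implying the CVAR uncertainty has
nothing whatever to do with uncertainty due to finite-statistics Monte-Carlo
noise. MC noise exists independently of (and in addition to) the uncertainty we
have been discussing.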

In Ch. 28 we give a closed form expression for the CVAR volatility using 
functional methods, consistent with an infinite number of states. 


Obtaining CVAR Volatilities in Practice; the “Ergodic Trick” 


In order to get the CVAR volatility numerically, we use an “ergodic trick”.
Rather than producing many MC runs (which would be prohibitive numerically),
we note that for a Gaussian model, we can connect the total portfolio losses at
two different confidence levels CL and CL' by

$$ d\$C_{CL} = \frac{k_{CL}}{k_{CL'}}\; d\$C_{CL'} \qquad (26.21) $$

Thus, for a given CL generated at state $\aleph(k_{CL})$ we can take some neighboring
states $\{ \aleph(k_{CL'}) \}$ at different CL' and scale up each $(d_t x_\alpha)^{\aleph(k_{CL'})}$ by $k_{CL}/k_{CL'}$.
For example to scale up $(d_t x_\alpha)^{\aleph(k_{CL'})}$ at CL' = 98% to CL = 99% we would
just use $k_{CL}/k_{CL'}$ = 2.33/2.05. We then calculate the CVARs for these scaled
neighboring states³¹. The CVAR volatility is defined as the standard deviation of
these CVARs. See Ch. 38 for more mathematical details.
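
A small numerical sketch of the ergodic trick follows, under stated assumptions. The weights and correlations are made-up, $dt$ = 1, and the window of 200 neighbors on each side of the 99% CL state is an arbitrary choice. As a stand-in for the ratio $k_{CL}/k_{CL'}$ we rescale each neighboring state by the ratio of total losses, which is the same thing in the Gaussian linear model by eq. (26.21).

```python
import numpy as np

rng = np.random.default_rng(2)

w = np.array([2.0, -1.5, 1.5])                 # E_a * sigma_a, hypothetical
corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.0],
                 [0.1, 0.0, 1.0]])
n_mc, k_cl = 100_000, 2.326

dz = rng.standard_normal((n_mc, 3)) @ np.linalg.cholesky(corr).T
terms = dz * w                                 # component terms E_a (d_t x_a) in every state
d_c = terms.sum(axis=1)                        # total portfolio change in every state

order = np.argsort(d_c)                        # states sorted from worst loss upward
i_cl = int(round(0.01 * n_mc)) - 1             # position of the 99% CL state
half = 200                                     # neighbors taken on each side (a choice)
nbrs = order[i_cl - half: i_cl + half + 1]

scale = d_c[order[i_cl]] / d_c[nbrs]           # rescale each neighbor to the CL loss
cvars = terms[nbrs] * scale[:, None]           # scaled component VARs, one row per state

cvar_mean = cvars.mean(axis=0)
cvar_vol = cvars.std(axis=0, ddof=1)           # the CVAR volatility estimate
print("mean CVARs:", cvar_mean)
print("CVAR vols: ", cvar_vol)
```

Every rescaled state has exactly the same total loss, yet the components differ from state to state; the spread of those components is what the CVAR volatility measures.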


Confidence Levels for Individual Variables in VAR 


Consider the confidence level $CL_\alpha(k_{CL})$ for an individual variable, i.e. for the
change $(d_t x_\alpha)^{\aleph(k_{CL})}$ in the state $\aleph(k_{CL})$. This $CL_\alpha(k_{CL})$ is not the same as the
confidence level corresponding to $k_{CL}$ for the total VAR. For example, in the 99% CL state for the
total VAR, it can happen that a particular spread moves up by 1 SD, representing
the 84% CL for that spread.


31 Neighboring States and Convexity: Convexity corrections will be roughly equal for
neighboring states, provided the convexity is not too large. In practice the ergodic trick
works reasonably well in the presence of some convexity.




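Generally, given each variable change $(d_t x_\alpha)^{\aleph(k_{CL})}$ in the state $\aleph(k_{CL})$
producing the VAR at $k_{CL}$, we can get the number $k_{CL}^{(\alpha)}$ of SD using
$(d_t x_\alpha)^{\aleph(k_{CL})} = k_{CL}^{(\alpha)} \sigma_\alpha \sqrt{dt}$. Then the confidence level $CL_\alpha(k_{CL})$ for $(d_t x_\alpha)^{\aleph(k_{CL})}$
can be found as the cumulative normal distribution at the value $k_{CL}^{(\alpha)}$.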


Individual Variables generally do NOT contribute at High CL 
It turns out in practice that even for very high CL for VAR like 99.97% where
$k_{CL}$ = 3.43, individual $CL_\alpha(k_{CL})$ values for diversified portfolios are usually at
or below the 99% CL, corresponding to individual $k_{CL}^{(\alpha)} \le 2.33$. This is an
important point, and it means that the “worst-case” moves for individual
variables do not contribute to the VAR even at a high overall CL³².
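
One way to see why this happens is the standard Gaussian conditioning result: in the linear model, the average number of SD for variable $\alpha$ in the VAR state is $k_{CL}$ times the correlation of that variable with the total portfolio change. The sketch below uses that result; the function names and the 25-variable equal-weight portfolio are our illustrative assumptions, and the sign is quoted in the direction of the loss.

```python
import math
import numpy as np

def norm_cdf(k):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))

def expected_individual_k(w, corr, k_cl):
    """Average number of SD per variable in the state producing the VAR at k_cl SD."""
    w = np.asarray(w, float)            # E_a * sigma_a
    corr = np.asarray(corr, float)
    var_vol = np.sqrt(w @ corr @ w)
    return k_cl * (corr @ w) / var_vol  # k_cl times corr(variable, portfolio)

n = 25                                  # diversified: equal, uncorrelated exposures
k_alpha = expected_individual_k(np.ones(n), np.eye(n), k_cl=3.43)
cl_alpha = [norm_cdf(k) for k in k_alpha]
print("SD per variable:", k_alpha[0], " CL per variable:", cl_alpha[0])

# A single dominant exposure degenerates to the total-VAR confidence level
k_single = expected_individual_k([1.0], [[1.0]], k_cl=3.43)[0]
print("single exposure SD:", k_single)
```

For the diversified case each variable sits at only about 0.69 SD on average even though the total is at 3.43 SD, while the single-exposure case reproduces the exception noted in the footnote.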


Asset Class VAR (AC_VAR) 


AC_VAR is defined as the risk due to one risk factor at a given confidence level,
independent of the other risk factors. Thus AC_VAR is essentially a standalone
component VAR. The sum of the component VARs is the total VAR (in a linear
world), but the sum of asset-class VARs is not the total VAR. In a historical
simulation, the asset class VARs come from different dates. The component
VARs all come from one date, namely the date that produces the total VAR.


32 Exception - Lack of Diversification: The exception is when a single exposure
dominates the calculation. In that case, the CL for the variable corresponding to the
dominant single exposure will tend to be around the CL for the total VAR. This is
because the total risk just degenerates into the single exposure risk. We will see the same
phenomenon when we discuss issuer credit risk if there is a dominant exposure to a single
issuer in the portfolio.


Marginal VAR (MVAR)
Marginal VAR is the change of the VAR between two portfolios, a base
portfolio $C$ and a modified portfolio $C'$. The portfolio $C'$, with exposures
$\{ \$\mathcal{E}'_\alpha \}$, contains an extra deal (or deals) with exposures $\{ \$\mathcal{E}^{(\mathrm{Extra})}_\alpha \}$. The
original portfolio $C$ has exposures $\{ \$\mathcal{E}_\alpha \}$. We have $\$\mathcal{E}'_\alpha = \$\mathcal{E}_\alpha + \$\mathcal{E}^{(\mathrm{Extra})}_\alpha$. To
get the VAR for $C'$ at a given CL (e.g. CL = 99% with $k_{CL}$ = 2.33) we need
to find the 1% worst loss $d\$C'_{CL}$ corresponding to the state $\aleph'(k_{CL})$.

Consider the linear case. We have by definition

$$ d\$C'_{CL} = \sum_{\alpha=1}^{n} \$\mathcal{E}'_\alpha \left( d_t x_\alpha \right)^{\aleph'(k_{CL})} \qquad (26.22) $$

We already got all changes for the original portfolio $d\$C^{(\aleph)} = \sum_{\alpha=1}^{n} \$\mathcal{E}_\alpha (d_t x_\alpha)^{(\aleph)}$
for all states $\{\aleph\}$, which we previously used to get $d\$C_{CL}$ in the state $\aleph(k_{CL})$.
We do not have to recalculate anything for the original portfolio. We calculate
extra deal changes, $d\$C^{(\mathrm{Extra})(\aleph)} = \sum_{\alpha=1}^{n} \$\mathcal{E}^{(\mathrm{Extra})}_\alpha (d_t x_\alpha)^{(\aleph)}$ in all states $\{\aleph\}$.

For each state $\aleph$, we obtain $d\$C'^{(\aleph)} = d\$C^{(\aleph)} + d\$C^{(\mathrm{Extra})(\aleph)}$. We look for
the 1% worst loss, the VAR $d\$C'_{CL}$ for the total portfolio, in state $\aleph'(k_{CL})$.

The Marginal VAR $\$\mathrm{MVAR}_{CL}$ at that CL is then the difference in VARs, viz

$$ \$\mathrm{MVAR}_{CL} = d\$C'_{CL} - d\$C_{CL} \qquad (26.23) $$
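
The bookkeeping above (reuse the stored base-portfolio changes, add the extra-deal changes state by state, then re-rank) can be sketched as follows. All numbers are made-up, the function name `var_from_states` is ours, and VAR is quoted here as a positive loss, so a negative marginal VAR means the extra deal reduces risk.

```python
import numpy as np

def var_from_states(d_c, cl=0.99):
    """Loss (as a positive number) at the given CL from simulated portfolio changes."""
    n_tail = max(int(round((1.0 - cl) * len(d_c))), 1)
    return -np.sort(d_c)[n_tail - 1]

rng = np.random.default_rng(3)
n_mc = 50_000
dx = rng.standard_normal((n_mc, 2)) * np.array([0.02, 0.03])  # hypothetical moves per state

base = np.array([100.0, -50.0])     # exposures of the original portfolio
extra = np.array([0.0, 40.0])       # exposures of the extra deal (here a partial hedge)

d_c_base = dx @ base                # already available from the original VAR run
d_c_extra = dx @ extra              # the only new calculation
d_c_new = d_c_base + d_c_extra      # modified portfolio, state by state

mvar = var_from_states(d_c_new) - var_from_states(d_c_base)
print("marginal VAR:", mvar)
```

Note that the state producing the VAR of the modified portfolio is in general not the state that produced the VAR of the original portfolio, which is why the re-ranking step is needed.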


Other VAR Work 


Jamshidian and Zhu formulated a scenario simulation with a discretization of
multivariate distributions, for calculating VAR. See also Ch. 44.


27. Enhanced/Stressed VAR (Tech. Index 5/10)


In this chapter, we discuss various stages of refinements of the plain-vanilla VAR 
discussed in Ch. 26. Increasingly realistic aspects will be included, with the final 
aim to obtain a risk measure that is more useful in active risk management. The 
first set of improvements gives what is termed in this book “Improved Plain
Vanilla VAR” (IPV-VAR). We then list further improvements to produce
“Stressed VAR” (S-VAR) and finally “Enhanced/Stressed VAR” (ES-VAR)¹,².
We close with some miscellaneous topics including subadditivity issues, and also
an integrated form of VAR for which Expected Shortfall is a special case.


Improved Plain-Vanilla VAR (IPV-VAR) 


The following table summarizes the next stage, including refinements past the
PV-VAR to obtain the IPV-VAR, or Improved Plain-Vanilla VAR. These
refinements are often included in current implementations of VAR.


Quantity Compared              | Plain Vanilla VAR    | Improved PV VAR
Convexity                      | Not Included         | Included via Grid
Time Scale $dt$                | Uniform (10 days)    | Variable (liquidity)
Cutoffs for $d_t x_\alpha$     | Not Included         | Included (Judgment)
Time Period: $x_\alpha$ Data   | Recent (1 to 3 yrs)  | Recent or Variable


We describe these improvements in the IPV-VAR one at a time. 


1 History: I developed some of these improvements to VAR between 1999 and 2002. Most
of Improved Plain-Vanilla VAR is now routinely included in VAR analyses. The various
Enhanced and Stressed refinements are less common.


2 The Story of Stressed VAR and “Turbulent VAR”: I believe that I invented the term
“Stressed VAR”. Long afterwards, regulators decided to use the same name “Stressed
VAR” for a simple form of my Stressed VAR, which I used to call “Turbulent VAR”.
See later for the details.


I guess in hindsight all of this is amusing. What do you think?


Convexity and the Grid


Convexity exists in all option products, and even to some extent in discount 
factors. Convexity effects can be included in a VAR calculation if a grid of 


exposures is available. A given variable $x_\alpha$ is changed by discrete amounts to
values on a grid, $x_\alpha^{(\mathrm{Grid})}$, for example $x_\alpha^{(\mathrm{Grid})} = x_\alpha + k_\alpha^{(\mathrm{Grid})} \sigma_\alpha$, with various
values for $k_\alpha^{(\mathrm{Grid})}$. Before starting the VAR calculation, the portfolio is revalued
at each grid point.

The idea is that, for a given $d_t x_\alpha$ arising from a given throw of the dice, we
pick out the appropriate exposure on the grid including convexity. We then get
the MC state $\aleph(k_{CL})$ at the CL with $k_{CL}$ SD for the total VAR including this
convexity. We pick the state at the CL desired to report the total VAR.
Procedurally, a lookup interpolation table for the grid can be established. 
Another and simpler possibility is just to choose the exposure at a 
conservative level of, say, two SD. 


Example for the Grid 
For example, suppose that we have the DV01 of a portfolio at five grid points:
the current interest rate $r_0$, $r_0 \pm 50$bp, and $r_0 \pm 100$bp, obtained by direct
revaluation. Suppose a simulated return is $d_t r \approx 50$bp. We use the calculated
DV01 at $r_0 + 50$bp, the closest grid point. We continue the simulation. We then
get the state $\aleph(k_{CL})$ corresponding to the 99% CL for the total VAR.

This procedure clearly includes convexity effects, since the DV01 chosen
depends on the rate level, for this example $r_0 + 50$bp.
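
The lookup can be sketched in a few lines. The DV01 values on the grid below are invented for illustration (in practice they come from direct revaluation), and the helper names `dv01_nearest` and `dv01_interp` are ours. We show both the nearest-grid-point choice described above and a linear interpolation table.

```python
import numpy as np

grid_bp = np.array([-100.0, -50.0, 0.0, 50.0, 100.0])     # rate shifts from r0, in bp
dv01_grid = np.array([-12.0, -11.0, -10.0, -9.2, -8.5])   # hypothetical DV01 at each grid point

def dv01_nearest(move_bp):
    """DV01 at the grid point closest to the simulated move."""
    return dv01_grid[np.argmin(np.abs(grid_bp - move_bp))]

def dv01_interp(move_bp):
    """DV01 from a linear lookup-interpolation table on the grid."""
    return float(np.interp(move_bp, grid_bp, dv01_grid))

move = 47.0                                   # simulated rate move in bp, close to +50bp
pnl_flat = dv01_grid[2] * move                # ignoring convexity: DV01 at the current rate
pnl_grid = dv01_nearest(move) * move          # using the DV01 at the closest grid point

print("no convexity:", pnl_flat, " with grid:", pnl_grid)
print("interpolated DV01 at +25bp:", dv01_interp(25.0))
```

The difference between the two P&L numbers is the convexity effect that the grid is meant to capture.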


Caveats 


The procedure involving one-dimensional grids naturally does not include cross- 
convexity terms between different variables. 

Considerable effort generally has to be expended in order to generate the grid
in the first place. The lucky situation will be if the front-office risk systems
already generate the grid. Otherwise, things become murky³.


3 Murky: This is a technical term, possibly implying a long interaction to get (or not get)
the grid. In addition, note that a real grid is not the same as a useless grid with the change
in portfolio value just scaled up arithmetically, thus ignoring the convexity and missing
the whole point. You might want to make that clear in the discussions.


Time Scale dt, Liquidity, and Product Types


Besides the daily $dt$ = 1 day, common assumptions are $dt$ = 10 (business) days
or $dt$ = 1 quarter. For the Gaussian assumptions behind the VAR, the translation
between the various assumptions is just made using $\sqrt{dt}$ scaling. However, real
losses in real firms with real traders and real strategies may have nothing to do
with an assumed $\sqrt{dt}$ scaling of a frozen portfolio.


It is a good idea to step back to see what the parameter dt is supposed to 
represent. We can think of dt in two distinct ways: 


• Assumption #1. The time dt is a market perturbation time over which an
“unusual” large disruption occurs, after which the market returns to a
“normal” state.

• Assumption #2. The time dt is a liquidity time, i.e. the time it takes traders to
sell or hedge the risky exposures in a generally turbulent market.


Assumption #1: dt = market perturbation time. The problem with this 
assumption is that bad perturbations across markets can occur for times greater 
than, e.g. 10 days. Markets that become roiled can stay turbulent and volatile for 
a long time. The relaxation time is rather long to arrive at a calm state from a 
market that has made a phase transition into a panic-driven state, where clever 
trading strategies collapse, mean reversion becomes a fantasy, and investors jump 
en masse onto the flight-to-quality bandwagon. 

Assumption #2: dt = liquidity time. If dt represents the time it takes to sell
or hedge the risk, there is no single number that corresponds to such action.
Government bonds, short-dated plain-vanilla swaps, FX forwards and other such
liquid instruments have a short liquidity dt. On the other hand, illiquid securities
with limited transaction volume have a long liquidity time dt even in a calm
market, and arguably an even longer dt in a turbulent market.

Therefore, denoting product type by the label $\Im$, we have to include the $\Im$
dependence as $dt^{(\Im)}$. Note that the same underlying $d_t x_\alpha$ may contribute to
several different product types with different liquidities. For example, Libor can
contribute to two-year IMM swaps or to index-amortizing swaps, with different
liquidities. Hence, to be accurate we should define an exposure corresponding to
a given product type, $\$\mathcal{E}_\alpha^{(\Im)}$, with $\$\mathcal{E}_\alpha = \sum_{\Im} \$\mathcal{E}_\alpha^{(\Im)}$ being the total exposure for
the underlying $d_t x_\alpha$.

We can then include liquidity square-root-time effects by defining effective
exposures by product and underlying⁴ as $\$\mathcal{E}_\alpha^{(\Im),\mathrm{Eff}} = \$\mathcal{E}_\alpha^{(\Im)} \sqrt{dt^{(\Im)}}$, while
dropping the now irrelevant overall factor $\sqrt{dt}$.

The main problem with this assumption is that there is no guarantee that the
risk will actually be hedged or eliminated in the time $dt^{(\Im)}$. As we discuss below,
there are decision times to act during which losses can accumulate, and there are
sound business reasons why the risky inventory may not be eliminated or hedged.
Moreover, there is the underlying problem of defining the value of $dt^{(\Im)}$ in a
turbulent environment⁵.
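
Here is a minimal sketch of the effective-exposure idea fed into the quadratic VAR. The exposure matrix, liquidity times, vols and correlations are hypothetical, and the function name `liquidity_var` is ours. We sum the effective exposures over product types for each underlying, which is legitimate in the linear case because the same underlying drives all of its product types.

```python
import numpy as np

def liquidity_var(exposures, dt_by_product, vols, corr, k_cl=2.33):
    """Quadratic VAR with a separate liquidity time for each product type.

    exposures[j, a] is the exposure of product type j to underlying a;
    vols are quoted per square root of the time unit used in dt_by_product.
    """
    exposures = np.asarray(exposures, float)
    root_dt = np.sqrt(np.asarray(dt_by_product, float))
    eff = (exposures * root_dt[:, None]).sum(axis=0)   # effective exposure per underlying
    w = eff * np.asarray(vols, float)
    return k_cl * float(np.sqrt(w @ np.asarray(corr, float) @ w))  # no extra sqrt(dt) factor

exposures = [[100.0, 0.0],      # liquid product type
             [40.0, 25.0]]      # illiquid product type
vols = [0.02, 0.03]             # daily vols
corr = [[1.0, 0.3],
        [0.3, 1.0]]

var_1d = liquidity_var(exposures, [1.0, 1.0], vols, corr)      # everything liquid in 1 day
var_mixed = liquidity_var(exposures, [1.0, 10.0], vols, corr)  # illiquid type needs 10 days
var_10d = liquidity_var(exposures, [10.0, 10.0], vols, corr)   # uniform 10 days
print(var_1d, var_mixed, var_10d)
```

With a uniform liquidity time the result reduces to the usual square-root-of-time scaling, and the mixed case lands between the two uniform cases for this example.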


Cutoffs for Underlying Variable Moves 


The Monte-Carlo simulation is rather stupid in at least two ways. First, a
volatility input derived from data may have an “unreasonable” value, and the
simulator will use any input vol. Second, the normal constraints for spreads, for
example, may be “unreasonably” violated by the MC state $\aleph(k_{CL})$ generating
the total VAR.

The volatility can be “unreasonable” for several reasons. First, there can be
bad data points in the time series. Second, the time period of the data series may
be over a particularly violent period or a particularly calm period (see below for
further discussion). In such cases, modification of the volatility may be
desirable⁶.

Constraints occur among spreads between different markets, involving credit, 
liquidity, and other factors. While the details are complicated, we naturally want 
obvious constraints maintained, for example high-yield or emerging markets 
spreads being larger than high-grade corporate spreads. However, the MC 
simulator only knows about the underlying variable statistics, the multivariate 
Gaussian pdf, the exposures, the liquidity time(s), etc. Therefore, the MC state
$\aleph(k_{CL})$ may have violations of these obvious constraints. For example, if there
is a large short exposure to a low-credit rate, the simulator will perversely find
the state of loss where this rate rallies, and for a high volatility typical of
low-credit rates, this rate may rally lower than the high-grade corporate rate in the
MC state $\aleph(k_{CL})$.

For this reason, cutoff logic on the moves $d_t x_\alpha$, while messy to implement,
can be desirable.


4 Where to Put the Time Square Roots? We could also multiply the $\sqrt{dt^{(\Im)}}$ factors
into volatilities, but then the underlying simulation itself would depend on the product
and become unwieldy.


5 What Liquidity Time? Traders may want to associate $dt^{(\Im)}$ with the normal liquidity
time for normal business operations in normal markets. This may have nothing to do with
the abnormal liquidity time for abnormal business operations in abnormal markets.


6 What Volatility Modification? There is naturally no unique answer. One procedure
would involve a collective discussion with Risk Managers and Traders.


Regulatory Stressed VAR, Turbulent VAR, VAR Uncertainty 


A particular time period may be particularly violent or calm. This can be used to
define a VAR uncertainty. The idea is simple. Different time periods $\{ \Delta T_\Lambda \}$
with different market environments are defined, labeled by $\Lambda$. Then, the VAR
for each environment is calculated. In this way, the VAR uncertainty from these
different VAR results is exhibited. In particular we have

• NORMAL PERIOD VAR: A normal calm period can be used
to get the standard VAR that we can call $\mathrm{VAR}_{\mathrm{Calm}}$.

• REGULATORY STRESSED VAR = TURBULENT VAR: Long
before 2008, I suggested using a turbulent period to get what I used
to call $\mathrm{VAR}_{\mathrm{Turbulent}}$. The same definition was subsequently used much
later for a regulatory definition of Stressed VAR⁷.


We stated at the beginning of this book that it would be highly desirable to 
have a handle on the uncertainty in the risk measures themselves. The uncertainty 
in assumptions itself poses a risk, and this assumption risk is not included in the 
risk calculated under a given set of assumptions. The calculation of the
uncertainty of the VAR is an example that should be more widely used⁸.


7 Story — the VAR Zoo: Hey, I like this story so much I am going to give details. Before 
I started developing Stressed VAR (the original definition of these words is contained in 
this book), and long before the 2008 crisis, I proposed calculating VAR in different time 
periods to get what I called Normal VAR and Turbulent VAR. Nobody wanted to do this. 
The management response was “Jan, you only get one VAR. Which one do you want?” 

Eventually in Basel 2.5 the regulators specified using what I called Turbulent VAR as
the regulatory Stressed VAR. Again, regulatory Stressed VAR is NOT my original
definition of Stressed VAR, just a primitive special case.


8 Other Reasons for VAR Uncertainty: These include data problems, for example short
time series that must be proxied by longer time series that are not strictly speaking
equivalent. Since this can be done in different ways, an uncertainty in the VAR exists.


Enhanced/Stressed VAR (ES-VAR)


The ES-VAR, or Enhanced/Stressed VAR, is the most refined version of VAR 
that we shall consider. The “Stressed” attribute means that risks further out in the 
tails will be considered. The “Enhanced” attribute means that other attributes 
lending a more realistic aspect to the VAR will be included. 

The ES-VAR includes the refinements of the IPV-VAR given above, plus 
additional items that are summarized in the following table: 


Quantity Compared          | Improved PV VAR      | Enhanced/Stressed VAR
VAR Confidence Level       | 99%                  | High, e.g. 99.97%
Volatility Input           | Standard Deviations  | Fat-tail Vols
Correlation Input          | Usual Definition     | Stressed Correlations
Idiosyncratic (no data)    | Not Included         | Estimated (Judgment)
Liquidity Penalty          | Zero                 | Nonzero for Hostile Market
T(Start) Expos. Reduction  | Start of Risk Period | Any Time in Risk Period
Starting Expos. Level      | At Frozen Time       | Adjust to Expected Level
Ending Exposure Level      | Zero                 | Nonzero (Judgment)
Number of Time Steps       | One Step             | Several Steps, Composite Vol


We discuss these enhancements one at a time. 


Higher VAR CLs for Stressed VAR and Economic Capital 


The first improvement to get the stressed VAR is straightforward, and just 
involves raising the CL from the canonical 99% level. We will discuss Economic 
Capital (EC) in Ch. 39. For the moment, we merely note that EC is generally 
defined at a very high CL, for example 99.97%. The VAR can be run at this high 
CL, naturally if enough MC events are generated. For example, if 30,000 events
are generated, we take the ninth worst loss⁹.

The CVAR uncertainties can still be obtained using the ergodic trick for the
states $\{ \aleph(k_{CL'}) \}$ surrounding the 99.97% CL state $\aleph(k_{CL})$ with $k_{CL}$ = 3.43.
For 30,000 events, for example, we can use 15 states, including seven just above
and seven just below the 99.97% CL state.


9 Economic-Capital and VAR scaling-factors: Often Economic Capital is defined for
market risk using a scaling of standard 99% CL-VAR with a numerical “scaling-factor”,
usually taken as 3 or 4. In our opinion there is little justification for any such specific
a priori assumption. A better approach is to deal head-on with the issues, which is what we
do in this book. After the dust settles, a “scaling-factor” could be defined by the ratio of
the calculated Economic Capital to the standard VAR.




Fat-Tail Vols for Stressed VAR 


We have discussed fat-tail (FT) vols in Ch. 21. Here we note an important 
practical consistency of the MC simulation of the Stressed VAR with the 
definition of the FT vols. The FT vol assumption involves fitting the tail of the
$\{ d_t x_\alpha(t) \}$ histogram, e.g. at the 99% CL, to obtain $\sigma_\alpha^{(FT)} = d_t x_\alpha^{(99\%)} / 2.33$.

The consistency with the Stressed VAR for diversified portfolios is that the
values of the underlying moves $(d_t x_\alpha)^{\aleph(k_{CL})}$ producing the most risk are
observed in practice to be around and rarely above the 99% CL. Therefore use of
the FT vols in the MC simulator using a 99% CL is a consistent assumption¹⁰.


Stressed Correlations for Stressed VAR 


In previous chapters, we discussed stressed correlations at great length. Here we
first note that the use of stressed correlations $\rho_{\alpha\beta}^{(\mathrm{Stressed})}$ in practice generally
results in the increase in the risk, relative to the use of unstressed correlations.
Hence, we at least want to use some scenario for stressed correlations.

Ultimately, the use of stochastic correlation matrices would provide an even
richer set of states. The idea would be to construct a set $\{ \rho^{(\mathrm{Stressed})}_{\mathrm{Matrix}\ \gamma} \}$ of stressed
positive-definite matrices. Then a randomly selected matrix $\rho^{(\mathrm{Stressed})}_{\mathrm{Matrix}\ \gamma}$ would be
used for each state of the set $\{ d_t x_\alpha \}$, and the 99.97% CL state from all such
states would then be chosen for the total Stressed VAR¹¹.


Idiosyncratic Risk Inclusion for Enhanced VAR 


By “idiosyncratic risk” is meant a risk that is not included in the VAR as defined 
above. A short and incomplete list of examples includes: 


• Illiquidity risks of low-credit bonds
• Risks of one-off options with a one-way market
• Various types of basis risks for spreads or volatilities
• Volatility skew effects
• Exposures lasting only a short time and not captured at the frozen time $t_{\mathrm{frozen}}$
• Specific risks in the zoo of mortgage products
• Unusual political uncertainty effects in emerging markets
• Anomalous yield-curve shape-change effects


10 Fat-Tail Vols for All Variables vs. Stochastic Volatility: Using the FT vols for all
moves will overestimate the risks of the less risky underlyings, but since by definition
these are less risky, not much error is produced in the high CL Stressed VAR. A better
assumption could be to use a stochastic volatility fitting the fat tails, although this is
difficult to implement.


11 Stressed Stochastic Correlation Matrices: Such a model that maintains positive
definiteness was discussed in Ch. 23.


The estimation and approximate quantification of such risks involves 
advanced risk management. Analysis requires microscopic and deep knowledge 
of individual markets. Inclusion of idiosyncratic risk into the VAR first involves 
a judgment call for the stand-alone magnitude of each such risk. 


The correlations between different idiosyncratic risks $\rho_{\mathrm{Idio},\mathrm{Idio}'}$, and between
idiosyncratic risks and normal risks $\rho_{\mathrm{Idio},\mathrm{Normal}}$ need to be specified. One simple
assumption is to use one value $\rho_{\mathrm{Idio},\mathrm{Any}}$ for all such correlations, and then take
the maximum value $\rho^{\mathrm{Max}}_{\mathrm{Idio},\mathrm{Any}}$ consistent with positive definiteness for the total
correlation matrix. One way to do this is to get the stressed correlation matrix for
the normal variables, add in $\rho_{\mathrm{Idio},\mathrm{Any}}$ for all idiosyncratic-risk correlations, and
then increase $\rho_{\mathrm{Idio},\mathrm{Any}}$ to get the biggest possible result $\rho^{\mathrm{Max}}_{\mathrm{Idio},\mathrm{Any}}$ still keeping the
total correlation matrix positive definite¹². As a rule of thumb, an idiosyncratic
correlation of around 0.3 with all other variables is a reasonable approximation in
practice. Compared to the histogram of normal correlations, even stressed, this is
a rather large and therefore conservative correlation.
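
The search for the largest common idiosyncratic correlation can be sketched with a simple bisection on the smallest eigenvalue. This is one possible numerical approach, not necessarily how it is done in any production system; the function names and the two-variable “normal” correlation matrix are our illustrative assumptions. Bisection is valid here because the set of positive-definite matrices is convex, so the allowed values form an interval starting at zero.

```python
import numpy as np

def total_corr(corr_normal, n_idio, rho):
    """Total correlation matrix: normal block plus idiosyncratic risks all correlated at rho."""
    n = corr_normal.shape[0]
    total = np.full((n + n_idio, n + n_idio), float(rho))
    total[:n, :n] = corr_normal
    np.fill_diagonal(total, 1.0)
    return total

def max_idio_corr(corr_normal, n_idio, tol=1e-6):
    """Largest common correlation keeping the total matrix positive definite (bisection)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.linalg.eigvalsh(total_corr(corr_normal, n_idio, mid)).min() > 0.0:
            lo = mid        # still positive definite, try a larger correlation
        else:
            hi = mid        # too large, back off
    return lo

corr_normal = np.array([[1.0, -0.5],
                        [-0.5, 1.0]])       # hypothetical stressed normal-variable correlations
rho_max = max_idio_corr(corr_normal, n_idio=1)
print("maximum idiosyncratic correlation:", rho_max)
```

For this small example the determinant of the total matrix is 0.75 - 3ρ², so the bound is 0.5, which the bisection reproduces.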

The VAR is then run with the idiosyncratic risks added in deterministically as 
above, and with the MC simulation generating the normal risks as usual. The 
state with the desired CL for the total VAR (including the idiosyncratic risk) is 
then picked out. 

The CVARs for the idiosyncratic risks can be defined using quadratic sum 
approximations, as we show below. 


Illiquidity Penalty for Enhanced VAR 


The reduction of exposure over the period ai for product types J іп a 
turbulent market environment usually involves additional losses for spread 
products and any other product with illiquid aspects. There are two reasons for 
these additional losses. The first is the inevitable flight to quality reducing 
demand for illiquid 3. There is also a magnification effect since many firms will 


7 Positive-Definite Total Correlation Matrix: If the positive-definite condition is 
violated, you can get the total risk calculated as larger than the sum of the stand-alone 
normal risk and the stand-alone idiosyncratic risk. This makes no sense, violating the 
necessary conditions of real azimuthal angles in the geometric construction of Ch. 22. It 
also violates the condition of “subadditivity”, discussed below.

be trying to reduce the same $\Im$ exposures at the same time under the
circumstances, thus reducing secondary trading possibilities”. 


For this reason, a liquidity penalty $\Lambda^{(\Im)}$ depending on product type $\Im$ and/or
underlying $d_t x_\alpha$ can be introduced. Thus, we increase the risk due to the
exposure by the appropriate extra penalty. Depending on preference, the liquidity
penalty can be expressed as a \$ loss, additional spread in bp, percentage change, etc.

If $\Lambda^{(\Im)}$ is defined as a percentage change, then logically this quantity can be
added onto the exposure by replacing

$$ {}^{\$}\mathcal{E}_\alpha \to {}^{\$}\mathcal{E}_\alpha \left( 1 + \Lambda^{(\Im)} \right) \qquad (27.1) $$

We will shortly distinguish exposures that are sold in the liquidity period and 
those that are not sold. The liquidity penalty then will apply to the exposures that 
are sold, since these correspond to realized losses. 


Starting Time for Exposure Reduction, Asteroids, Decisions 


We interpreted the liquidity time $dt^{(\Im)}$ as the time needed for exposure reduction
of product type $\Im$. However, we have been vague about the details regarding the
actual sequence of events leading to exposure reduction. To illustrate, suppose
that we are calculating the VAR for one quarter (3 months). Call this one-quarter
time period the “Risk Period” $\Delta T_{RiskPeriod}$.

Now a specific liquidity time interval $dt^{(\Im)}$ may be only, say, 5 days. We
have effectively assumed that the start of this 5-day period (and every other 
liquidity time interval) occurs at the beginning of the risk period. Effectively we 
have assumed that the turbulent bad market condition—call it an “asteroid”— 
starts at the beginning of the risk period. 

Essentially this assumes that a “red flag” goes up for all desks simultaneously
and that all desks start reducing exposures simultaneously at the beginning of the 
risk period. 

In reality, there is no reason that simultaneous risk reduction should occur. 
There are at least two good reasons why the liquidity time intervals can start at 


? Death of a Strategy: Here is a rerun of a story. Once, a swaps desk had the idea of 
using mortgage derivatives in a certain strategy. A sudden adverse change of interest rates 
led to decisions to sell these derivatives by every broker-dealer on the Street, with no 
buyers at model prices. The sale price was so low that the effective prepayments implied 
by the models were astronomically higher than historical prepayments. Thus, there was 
illiquidity at usual price levels, and a huge liquidation penalty.

various non-simultaneous times inside the risk period rather than at the
beginning:


1. Bad market events happen at different times in different markets. A quick 
look at data will confirm this assertion. 


2. A non-zero “decision” or reaction time $\tau^{(\Im)}_{Decision}$ to a bad market will occur
in order to decide to reverse the current strategy, which led a desk to take on the
exposure for the product $\Im$ in the first place. The desk may want to “wait for
a while to see what happens”, for example.


A picture of the idea is shown below with an hypothesized decision time to 
sell of 1 month and a liquidity time of 5 days, all within a 3-month risk period: 


[Figure: Example: Time Scales for VAR, including the Decision Time to React to Bad Market Event. Labels: Total Risk Period (1 Quarter); “Asteroid” hits here; Turbulent Market in Decision and Liquidity time periods; Liquidity in 5 days.]


The non-simultaneity of stressed conditions in different markets may or may
not increase the risk, depending on the assumptions. If it turns out that some
markets will not get stressed within $\Delta T_{RiskPeriod}$, the risk is lowered. On the other
hand, if risk builds up and then an asteroid hits within $\Delta T_{RiskPeriod}$, then the risk is
increased.

Because risk can accumulate from the beginning of the risk period to the
starting time of exposure reduction, the liquidity interval $dt^{(\Im)}$ does not give an
accurate description of the total risk. Consideration of the decision time in
general increases the risk.

Effectively, the liquidity time interval that should be used is the longer time
$dt^{(\Im)}_{Total}$ including the decision time, viz

$$ dt^{(\Im)}_{Total} = dt^{(\Im)} + \tau^{(\Im)}_{Decision} \qquad (27.2) $$

The value of $\tau^{(\Im)}_{Decision}$ depends on the strategy and desk. The default value
(assuming that the decision to defease does occur sometime in the risk period)
would be $\tau^{(\Im)}_{Decision} = \Delta T_{RiskPeriod}/2$, so $dt^{(\Im)}_{Total} = dt^{(\Im)} + \Delta T_{RiskPeriod}/2$.



Starting Exposure Level; Corrections to the Expected Level 
The starting exposure levels $\{{}^{\$}\mathcal{E}_\alpha\}$ before exposure reduction are taken as the
exposures at the frozen time $t_{Frozen}$. However, these exposures may not be
representative of the exposure levels ${}^{\$}\mathcal{E}^{(Expected)}_\alpha$ expected during the next risk
period $\Delta T_{RiskPeriod}$. For this reason, it may be reasonable to correct the exposures


input into the MC simulation to these expected levels. The expected levels would 
generally require knowledge of desk strategy, or data of historical exposures that 
could be used for the expected exposure, etc. 

In Ch. 40, a procedure is described to correct risk for unused exposures 
relative to their limits. The first step in this procedure is to perform exactly the 
above correction to expected exposure levels. However, it would be more 
accurate to perform the correction in the VAR. This is because the state 


N (ko ) for the VAR is itself dependent on the exposures assumed. 


The Ending Exposure Level is Not Zero 
We have assumed that, following the decision time, the starting exposures for 


product $\Im$ are reduced to zero in liquidity time interval $dt^{(\Im)}$. This is by no
means a realistic assumption. For example, a high-yield bond desk would not 
want to sell off its entire inventory even in a turbulent market, nor could the desk 
hedge the entire high-yield spread risk. The desk may have a profitable customer- 
flow business, and also may need to make markets. That is, for business reasons, 
the ending exposure may not be zero after exposure reduction. The fraction of the
exposure sold, $f^{(\Im)}$, is therefore a parameter. However, the remainder of the
exposure (unsold) has risk that accumulates throughout the entire risk period
$\Delta T_{RiskPeriod}$.


Three-Component Model for Risk Exposures 
A “Noise + View + Hold” risk model for exposures is a reasonable framework to 
begin to model a more realistic description of the different types of exposure. 

A picture of the idea follows: 


[Figure: Exposure Components for Risk Calculations. Series: E(Noise), E(View), E(Hold). Vertical axis: % Exposure. Horizontal axis: Time in Risk Period (1 Quarter).]


The noise component of a given exposure, ${}^{\$}\mathcal{E}_{\alpha;\,Noise}$, would represent the
part of the exposure due to day-to-day operations, customer transactions, normal
hedging activities, etc. The noise component would fluctuate as a function of
time around a mean (e.g. zero) with some characteristic time $\tau_{Noise}$. There would
be no decision time for this exposure. For purposes of risk calculations, the noise
exposure component would be taken from its frozen level and dropped to zero in
time $\tau_{Noise}$. This is because $\tau_{Noise}$ is the representative time that the exposure
noise would fluctuate to zero anyway. Properties of the noise component could
be determined through historical statistics of the exposure.

The view component ${}^{\$}\mathcal{E}_{\alpha;\,View}$ could represent an exposure held because
of a certain strategy of the desk (e.g. yield curve steepening, currency weakening,
commodity forecasting...). It could represent an expiring option, maturing
bond...whose exposure disappears and which the desk does not plan to roll over
into a similar instrument. The view component could also represent a hedging
strategy over limited time. The view exposure would be expendable in time $\tau_{View}$
(including a decision time) if the market turned sufficiently against the strategy.

The hold component ${}^{\$}\mathcal{E}_{\alpha;\,Hold}$ would represent the core component that
would be held regardless of the market just in order to stay in business or for
some other reason. It would be held for the entire risk period $\Delta T_{RiskPeriod}$. Each of
these exposure components produces risk over a different period $\tau$. Therefore,
each exposure would be associated with its own $\sqrt{\tau}$ scaling factor.
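
To make the separate time-scaling concrete, the following Python sketch computes a stand-alone risk for each of the three components as (number of SD) × exposure × vol × √(time). All numbers are hypothetical, the function name is an assumption of this example, and a single daily vol is used for all three components for simplicity. The sketch deliberately does not address how the three component risks should be combined into one number.

```python
import math

def component_risk(exposure, vol_per_sqrt_day, horizon_days, k_cl=2.33):
    """Stand-alone risk k * exposure * vol * sqrt(horizon) for one exposure component."""
    return k_cl * exposure * vol_per_sqrt_day * math.sqrt(horizon_days)

# Hypothetical split of a total exposure of 100 into noise, view and hold parts.
exposures = {"noise": 20.0, "view": 30.0, "hold": 50.0}

# Hypothetical horizons in business days: noise decays in 5 days, the view
# exposure is sold after a 21-day decision time plus 5 days of liquidity time,
# and the hold exposure stays for the whole quarter (63 days).
horizons_days = {"noise": 5.0, "view": 26.0, "hold": 63.0}

daily_vol = 0.01  # hypothetical vol per sqrt(day)

risks = {
    name: component_risk(exposures[name], daily_vol, horizons_days[name])
    for name in exposures
}
```

With these inputs the hold component dominates, both because it is the largest exposure and because it carries the largest $\sqrt{\tau}$ factor.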


Exposure: Reduction vs. Double-Up in Turbulent Markets 


We have been assuming that at the end of the liquidity period, the exposure has 
been reduced from its current value. In real life, even in a turbulent market 
environment, traders may see opportunity and want to increase the exposure, or
“double up”. The trade-offs have to do with buying cheap versus watching the
market decline even further, and the tolerance of management for losses. 

We discuss the problem of additional risk for unused limits in Ch. 40. There, 
a model for exposure time dependence is postulated. This time dependence is 
supposed to occur from the current time up to the time that the turbulent market 
starts. At that point, the reduction or double-up behavior as a reaction to the
turbulent market begins. Since up to now we have only used a one time-step 
simulator, all these features become overlaid. We now turn to some comments 
regarding the extension of VAR to a multi-step simulator. 


Increasing the Number of Time Steps for VAR-Type Simulation 


We have introduced some explicit time dependence in the Enhanced VAR 
through the parameters defined above: $dt^{(\Im)}$, $\tau^{(\Im)}_{Decision}$, $\Delta T_{RiskPeriod}$, etc. Still, the
MC simulator used so far is effectively a one-step simulation. In principle, we 
can use a multi-step MC simulation with intermediate time assumptions related to 
all the above points. Such a simulation would require a high level of risk 
management sophistication. It would also be extremely costly, numerically 
intensive, and require more assumptions. Still, such a tool would provide the 
most realistic assessment of risk possible’. 

In the next chapter, we present a summary of the extension to multiple time 
steps using path integral techniques. 


" Color Movie for a Multi-Time-Step Risk Simulator? As mentioned at the beginning 
of the book, I have been waiting for over 15 years for such a sophisticated risk 
simulation, in color, as a movie. Maybe it will come soon. Then again, ...

As a partial improvement along these lines without involving the complexity
of the explicit treatment of multiple steps, composite volatilities can be defined
reflecting different events at intermediate times. For example, we can picture
two time periods during which the diffusion occurs with two different fat-tail
volatilities $\sigma^{(1)}_\alpha$ and $\sigma^{(2)}_\alpha$. Over the first period of time $\Delta T_1$, the noise exposure
component fluctuates to zero and the sale of the view exposure component is
achieved. In the second period of time $\Delta T_2$, the hold exposure component
continues to the end of the risk period. An effective volatility $\sigma^{(eff)}_\alpha$ taking all
this into account would be defined through the variance equation:

$$ {}^{\$}\mathcal{E}_\alpha\, \sigma^{(eff)}_\alpha \sqrt{\Delta T_{RiskPeriod}} = \left[ \left( {}^{\$}\mathcal{E}_\alpha\, \sigma^{(1)}_\alpha \sqrt{\Delta T_1} \right)^2 + \left( {}^{\$}\mathcal{E}_{\alpha;\,Hold}\, \sigma^{(2)}_\alpha \sqrt{\Delta T_2} \right)^2 \right]^{1/2} \qquad (27.3) $$

Subadditivity, Integrated VAR, Backtesting 


VAR and Subadditivity 


Heath and colleagues" have pointed out that under some conditions a 
subadditivity property can be violated. Basically, subadditivity asserts the 
condition that the sum of the risks of stand-alone portfolios should be greater
than or equal to the risk of the composite portfolio with all securities in all portfolios. This is
physically reasonable. Physically, diversification and risk cancellation can occur 
in the composite portfolio, but risk enhancement theoretically should not occur. 
Equivalently, the effective correlations between the risks of the individual
portfolios must have magnitudes $\leq 1$.

As a corollary, if the potential loss is doubled by doubling the amount of a
security, the risk measure of this loss should linearly double¹⁵.


15 Risk Enhancement (Other Aspects): Actually this academic statement neglects the
issue of volume effects that can non-linearly increase risk. If a desk owns a substantial 
fraction of the total issue of some security, there can be severe liquidity problems if the 
desk decides (or is forced) to sell. A portfolio with twice as much of a security can have 
a real risk of far greater than twice as much. The collapse of LTCM and various Arb 
desks in 1998 was in large measure due to exactly this problem.

Heath’s Example of Subadditivity Problem
Convexity effects can cause subadditivity violations. Heath’s example involves a
short digital call $C_{DigCall}$ and a short digital put $C_{DigPut}$ just before expiration.
Individually, at a given CL, the risks may be zero. Yet, the composite risk of the
total portfolio $C_{DigCall + DigPut}$ at the same CL may be nonzero, violating
subadditivity.

Consider a MC simulation. A fraction $f_{up}$ of paths arriving above the call
strike produces a loss, but if $CL < 1 - f_{up}$ the risk of $C_{DigCall}$ at that CL is zero.
Similarly, the risk of $C_{DigPut}$ is zero if $CL < 1 - f_{down}$ for paths arriving below
the put strike. However, in the composite portfolio, the total fraction of paths
causing a loss is $f_{tot} = f_{up} + f_{down}$. If $CL > 1 - f_{tot}$, the loss of $C_{DigCall + DigPut}$ is
not zero. The presence of both sets of paths and the discrete nature of the risk
causes the problem in this example.
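
The mechanism can be checked with a few lines of Python. This is a hedged sketch with made-up numbers: the path fractions (0.8% each), the unit payout, and the helper name are assumptions for illustration only. The VAR at a confidence level is taken here as the smallest loss whose cumulative probability reaches that level, and the two loss regions are assumed not to overlap (the call strike lies above the put strike).

```python
def discrete_var(loss_distribution, cl):
    """VAR at confidence level cl: smallest loss L with P(loss <= L) >= cl."""
    cumulative = 0.0
    for loss, probability in sorted(loss_distribution.items()):
        cumulative += probability
        if cumulative >= cl - 1e-12:   # small tolerance for floating-point sums
            return loss
    return max(loss_distribution)

payout = 1.0                 # hypothetical digital payout owed when the option ends in the money
f_up, f_down = 0.008, 0.008  # hypothetical fractions of paths above the call strike / below the put strike
cl = 0.99

# Loss distributions: {loss amount: probability}
short_call = {0.0: 1.0 - f_up, payout: f_up}
short_put = {0.0: 1.0 - f_down, payout: f_down}
combined = {0.0: 1.0 - f_up - f_down, payout: f_up + f_down}

var_call = discrete_var(short_call, cl)
var_put = discrete_var(short_put, cl)
var_combined = discrete_var(combined, cl)
```

Each stand-alone VAR is zero because 0.8% of paths is less than the 1% tail, while the composite portfolio loses on 1.6% of paths and so has a VAR equal to the full payout, larger than the sum of the stand-alone risks.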
One resolution of this problem suggested by Heath is to calculate risk from a 
limited set of paths using scenarios. 
For market risk of desks involving bonds that have small convexity (most of 
the risk of a bank), subadditivity does not really pose a practical problem. 


Practical Resolution of Stressed VAR with Subadditivity Consistency 


Subadditivity is satisfied for Stressed VAR with Gaussian Monte Carlo, 
including fat-tail vols and stressed correlations¹⁶.

The proof of subadditivity here rests on the existence of the positive-definite 
correlation matrix between underlyings. If idiosyncratic risk is included, the 
expanded correlation matrix involving these risks must also be positive definite. 


Confidence-Level Integrated VAR Measures 


We have been discussing risk at a given confidence level. Even the ergodic trick 
required the approximate transformation of states with different CLs into a given 


CL. Here we consider a different idea, an integrated ${}^{\$}VAR_{avg}$, calculated as an
average between two confidence levels $CL_{min}$ and $CL_{max}$. Such an integrated
risk measure has smoother properties than risk measured at an isolated CL.
Basically, integrals always smooth out functions. As in the ergodic trick, 
information from more paths is sampled in the averaging process. 


We shall meet the CL-integrated risk idea again when we consider issuer risk. 


16 Convexity Grid and Subadditivity Consistency for Stressed VAR: For strict 
consistency we would need convexity approximated by using a fixed “scenario” value for 
delta from a grid. This is reasonable for most risk, although not for equity derivatives.

Example of Average CL-Integrated VAR
To illustrate the idea, consider the simple quadratic VAR. The integrated average
${}^{\$}VAR_{avg}$ between $CL_{min}$ (at $k_{CL\,min}$ SD) and $CL_{max}$ (at $k_{CL\,max}$ SD) is obtained
by calculating ${}^{\$}VAR_{avg} = \sum_{\alpha=1}^{n} \langle {}^{\$}\mathcal{E}_\alpha\, d_t x_\alpha \rangle_{(k_{CL\,min},\,k_{CL\,max})}$. The result is the same as
${}^{\$}VAR$ if $k_{CL}$ is replaced by $k_{avg} = k_{avg}(k_{CL\,min}, k_{CL\,max})$. This $k_{avg}$ is the effective
number of SD, and is given by¹⁷

$$ k_{avg} = \frac{1}{\sqrt{2\pi}}\, \frac{\exp\left(-k_{CL\,min}^2/2\right) - \exp\left(-k_{CL\,max}^2/2\right)}{N(k_{CL\,max}) - N(k_{CL\,min})} \qquad (27.4) $$

Here $N(\cdot)$ is the usual normal integral. We get the result

$$ {}^{\$}VAR_{avg} = k_{avg}(k_{CL\,min}, k_{CL\,max})\, \sqrt{dt}\; {}^{\$}VAR^{Quad} \qquad (27.5) $$

We can consider, e.g. the ${}^{\$}VAR_{avg}$ at CL = 99%, as an average between
appropriately chosen $CL_{min}$, $CL_{max}$. In the limit $CL_{min} = CL_{max} = CL$, we get $k_{avg}(k_{CL\,min}, k_{CL\,max}) \to k_{CL}$,
restoring the usual formalism at a single confidence level.


Expected Shortfall (ES) or Conditional VAR 
Another possibility has $CL_{max} = 100\%$ with $k_{CL\,max} = \infty$. In that case, the
${}^{\$}VAR_{avg}$ risk is averaged over all paths in the tail past $CL_{min}$, and so we get all
the outliers. Note $k_{avg} > k_{CL\,min}$. The resulting integrated VAR is called Expected
Shortfall (ES) or Conditional VAR (confusingly also called CVAR¹⁸). Other


names for ES are “Mean Excess Loss”, “Mean Shortfall”, and “Tail VAR? " , 
Regulatory requirements are increasingly focused on ES / Conditional VAR. 


17 Homework: Read the next chapter, then come back and do this as an exercise. The 
appropriate expected value averaged between the two confidence levels is normalized by 
the denominator, which is the integrated measure between the two confidence levels. 


18 CVAR = Component VAR in this book: Please note that CVAR in this book means 
Component VAR, not Conditional VAR.

For ES at $CL_{min} = 99\%$ ($k_{99\% CL} = 2.33$), $k_{avg}$ (the effective number of SD)
is $k^{(99\% ES)}_{avg} = k_{avg}(k_{99\% CL}, k_{CL\,max} = \infty)$. Since $k_{CL}$ is the inverse normal of
CL, $N(k_{CL}) = CL$. Numerically, $k^{(99\% ES)}_{avg} = 2.67$. This is equivalent to 99.6%
VAR. Also, 97.5% ES with $k_{avg} = 2.34$ is close to 99% VAR with
$k_{99\% CL} = 2.33$. All this assumes Gaussian statistics.
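
The effective number of SD in Eq. (27.4) is easy to evaluate numerically. The following Python sketch is an illustration under Gaussian assumptions only: the function names are made up for this example, and the upper-tail probability is computed with the standard-library complementary error function. The denominator N(k_max) - N(k_min) is written as a difference of upper-tail probabilities, which is the same quantity but better behaved numerically far out in the tail.

```python
import math

def normal_pdf(k):
    """Standard normal density at k."""
    return math.exp(-0.5 * k * k) / math.sqrt(2.0 * math.pi)

def upper_tail(k):
    """Probability that a standard normal variable exceeds k, i.e. 1 - N(k)."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def effective_sd(k_min, k_max=math.inf):
    """Effective number of SD for risk averaged between k_min and k_max.

    With k_max = infinity this is the Gaussian Expected Shortfall multiplier.
    The two arguments must differ, otherwise the ratio is 0/0.
    """
    numerator = normal_pdf(k_min) - normal_pdf(k_max)
    denominator = upper_tail(k_min) - upper_tail(k_max)
    return numerator / denominator

k_99_es = effective_sd(2.33)    # 99% ES
k_975_es = effective_sd(1.96)   # 97.5% ES
```

These evaluate to roughly 2.67 and 2.34, in line with the values quoted above. Taking `k_max` only slightly above `k_min` returns a value close to `k_min`, which is the single-confidence-level limit.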


Stressed Expected Shortfall (Stressed ES), or Stressed Conditional VAR 


Stressed expected shortfall can be defined with the same Gaussian formalism 
with stressed input parameters (fat tail vols, stressed correlations). For SVAR at
the 99.97% CL with $k_{99.97\% CL} = 3.43$, $k^{(99.97\% ES)}_{avg} = 3.69$, or SVAR near the 99.99% CL.


Historical ES and Historical Stressed ES 


With the historical VAR approach, ES (or Stressed ES) is defined as the ES with 
input returns from the recent (or some stressed) historical period. This does not 
assume Gaussian statistics. 


General Measure Orthogonal Polynomials and Tail Risk, Generalizing ES 


In Ch. 36 we present general measure orthogonal polynomials? that may be 
useful as refined probes to decompose tail risk, potentially generalizing ES. 


Backtesting VAR, Backtesting Problems with Average CL-Integrated VAR 


In Ch. 26 we saw that backtesting VAR at a given CL (e.g. 99%) for a portfolio 
means that the P&L for the portfolio is compared to the VAR on the same 
portfolio calculated on the same day. Over one year (250 business days), if the 
VAR is capturing the portfolio risk, we expect 0.01*250 = 2.5 exceptions (i.e. 2 or 3),
up to statistical uncertainty.
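
The size of the statistical uncertainty can be gauged with a simple binomial model. The sketch below assumes, purely for illustration, that daily exceptions are independent with probability 1 - CL; that independence assumption and the function names are not taken from the text.

```python
import math

def expected_exceptions(n_days=250, cl=0.99):
    """Expected number of days on which the loss exceeds the VAR."""
    return n_days * (1.0 - cl)

def exception_probability(n_exceptions, n_days=250, cl=0.99):
    """Binomial probability of exactly n_exceptions, assuming independent days."""
    p = 1.0 - cl
    return (
        math.comb(n_days, n_exceptions)
        * p ** n_exceptions
        * (1.0 - p) ** (n_days - n_exceptions)
    )

mean_count = expected_exceptions()
prob_two_or_three = exception_probability(2) + exception_probability(3)
prob_zero = exception_probability(0)
```

Under this assumption, seeing exactly 2 or 3 exceptions in a year is the most likely outcome but far from certain, and even a year with no exceptions at all has a non-negligible probability.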

For average CL-integrated VAR, backtesting is problematic. Since there is no fixed 
confidence level, we don't know how many exceptions to expect 2. 


? General Measure Orthogonal Polynomials: These depend on moments of a given 
positive distribution. First formulated by Chebyshev, I rediscovered them. 


? Backtesting is problematic for CL-Integrated VAR or ES: For linear portfolios the 
effective number of standard deviations in Eq. (27.4) does not depend on the portfolio, so 
backtesting at that fixed confidence level would make sense. However for nonlinear 
portfolios that change every day, an effective number of standard deviations would also 
change every day. Backtesting for nonlinear portfolios with an integrated VAR therefore

A simpler way of measuring tail risk with backtesting would be just to raise
the VAR confidence level.


“Bayesian/Scenario VAR” 


We can define a “Bayesian” VAR or “Scenario” VAR! by selecting a subset of 
states from a random Monte Carlo simulation (or just from history) that happen 
to have fixed “scenario” values of returns for some variables, up to some 
uncertainty. We can for example choose the subset of states using Scenario I in 
Chapter 4, where the general stock market is postulated to drop e.g. between 50% 
and 70%. Other states are excluded. The risks of all variables (Libor...) in this 
restricted set of states are examined, and the VAR is calculated in the usual way - 
e.g., the 10th worst portfolio loss in 1,000 such states, for a 99% CL.
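
The selection step can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the state representation (a dictionary per simulated state), the function name, and the rounding convention for picking the ranked loss are all assumptions of this example.

```python
def scenario_var(states, in_scenario, portfolio_loss, cl=0.99):
    """VAR restricted to the simulated states that satisfy a scenario condition.

    states         : iterable of simulated (or historical) states
    in_scenario    : function state -> bool, True if the state matches the scenario
    portfolio_loss : function state -> loss of the portfolio in that state
    """
    selected = [state for state in states if in_scenario(state)]
    if not selected:
        raise ValueError("no state satisfies the scenario")
    losses = sorted((portfolio_loss(state) for state in selected), reverse=True)
    # e.g. 1,000 selected states at the 99% CL -> the 10th worst loss
    rank = max(1, round(len(losses) * (1.0 - cl)))
    return losses[rank - 1]

def stock_drop_between_50_and_70_percent(state):
    """Hypothetical scenario filter on the stock-market return of a state."""
    return -0.70 <= state["stock_return"] <= -0.50
```

A call such as `scenario_var(states, stock_drop_between_50_and_70_percent, lambda s: s["loss"])` then returns the loss at the chosen CL among the scenario states only. States outside the scenario are ignored, however large their losses.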

The advanced version of Stressed VAR allowing a scenario of stressed 
correlations and a definition of fat-tail volatilities is also an example. Indeed the 
idea of introducing scenarios in a mathematically consistent framework was one 
of the primary motivations for developing advanced Stressed VAR.

http://www.investopedia.com/terms/c/conditional_value_at_risk.asp


makes no sense. There is no way to count the number of expected exceptions with a 
simple formula like “0.01*250” that exists for standard VAR (at the 99% CL for 250 
business days per year). 


*! Acknowledgement: I thank Evan Picoult for a discussion on this topic. 


28. VAR, CVAR, CVAR Volatility Formalism 
(Tech. Index 7/10) 


In this chapter, we present a formal functional derivation of the VAR and CVAR 
equations for the linear case. Again, CVAR = Component VAR. We pay 
particular attention to the CVAR volatility. The derivation is done in the
continuous multivariate framework. This shows that CVAR uncertainties are
present in the limit of an infinite-length Monte-Carlo (MC) simulation run. We
indicate extensions for non-linear exposures (convexity) to VAR, as discussed in
the last chapter. We end with a summary of the extension to multiple time steps¹.


Set-up and Overview of the Formal VAR Results 


To perform the calculations, we need the multivariate Gaussian probability
distribution for one time step. The time difference of an underlying variable $x_\alpha$ is
$d_t x_\alpha(t) = x_\alpha(t+dt) - x_\alpha(t)$ at fixed time $t$. This is the return if $x_\alpha = \ln r_\alpha$.
To apply the formalism to the Stressed VAR, we would specify the vol $\sigma_\alpha$ of
$d_t x_\alpha$ as a fat-tail vol $\sigma^{(FatTail)}_\alpha$, as discussed in Ch. 21. We would also specify the
$d_t x_\alpha$, $d_t x_\beta$ correlation $\rho_{\alpha\beta}$ as the stressed correlation $\rho^{(Stressed)}_{\alpha\beta}$, as described in
Ch. 23, 24.

The probability is then an integral over all possible values of each $d_t x_\alpha$,
involving the measure $d(d_t x_\alpha)$. This horrible notation just means that we
actually integrate over values of $x_\alpha(t+dt)$ at fixed $x_\alpha(t)$, using the measure
$dx_\alpha(t+dt)$.

For simplicity of notation in this chapter, we do not indicate factors of $\sqrt{dt}$
in the intermediate formulae explicitly. This can be corrected simply by inserting
a factor $\sqrt{dt}$ for $\sigma_\alpha$, ${}^{\$}CVAR^{Quad}_\alpha$, and ${}^{\$}VAR^{Quad}$.


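1 History: I performed these calculations between 1998 and 2002.

The probability integral for calculating expectation values under the above
assumptions is the usual multivariate Gaussian in the underlying internal indices
for one time step²

$$ \langle 1 \rangle = \frac{1}{\sqrt{\det \rho}} \prod_{\alpha=1}^{n} \int_{-\infty}^{\infty} \frac{d(d_t x_\alpha)}{\sqrt{2\pi\sigma_\alpha^2}}\, \exp\left( -\Phi[\{d_t x_\alpha\}] \right) \qquad (28.1) $$

Here,

$$ \Phi[\{d_t x_\alpha\}] = \frac{1}{2} \sum_{\alpha,\beta=1}^{n} \frac{d_t x_\alpha}{\sigma_\alpha} \left( \rho^{-1} \right)_{\alpha\beta} \frac{d_t x_\beta}{\sigma_\beta} \qquad (28.2) $$

With the exposure ${}^{\$}\mathcal{E}_\alpha$, the change in value of the portfolio due to the change
$d_t x_\alpha$ is just ${}^{\$}\mathcal{E}_\alpha \cdot d_t x_\alpha$, by definition. For different values of $d_t x_\alpha$ we will get
different values of ${}^{\$}\mathcal{E}_\alpha \cdot d_t x_\alpha$. The total VAR will be the value of the sum of all
such contributions, at the specified CL corresponding to $k_{CL}$ standard deviations:

$$ {}^{\$}VAR = \left. \sum_{\alpha=1}^{n} {}^{\$}\mathcal{E}_\alpha \cdot d_t x_\alpha \right|_{k_{CL}} \qquad (28.3) $$

We next consider the CVARs (Component VARs). By definition, the
contribution to the total VAR from ${}^{\$}\mathcal{E}_\alpha \cdot d_t x_\alpha$ is just ${}^{\$}CVAR_\alpha$. On the average,
${}^{\$}CVAR_\alpha$ is the expectation value $\langle {}^{\$}\mathcal{E}_\alpha \cdot d_t x_\alpha \rangle$ over the distribution (28.1). We will
show in fact that up to a factor, $\langle {}^{\$}\mathcal{E}_\alpha \cdot d_t x_\alpha \rangle$ is just the quadratic form
${}^{\$}CVAR^{Quad}_\alpha$ described in a previous section.

We will further show that $\langle [ {}^{\$}\mathcal{E}_\alpha \cdot d_t x_\alpha ]^2 \rangle_c$ is nonzero. Therefore ${}^{\$}CVAR_\alpha$
has an uncertainty or volatility $\sigma({}^{\$}CVAR_\alpha)$, which we will calculate. We will
also show that $\sigma({}^{\$}CVAR_\alpha)$ has a nice geometrical interpretation.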


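2 Path Integral for multi-time-step VAR: The VAR we discuss here involves just one
time step. Inclusion of risk for many time steps can be handled using path
integrals, which begin by discretizing the time axis into many time steps. Hence, VAR
can in theory be generalized directly to many time steps. At the end
of this chapter, we discuss the matter a little further. The reader is also invited to read the
detailed discussion of path integrals and finance in this book, especially Ch. 45.

Calculation of the Generating Function
We now present the details. We define the generating function, depending on
conjugate variables $\{J_\alpha\}$, as

$$ \mathcal{Z}[\{J_\alpha\}] = \frac{1}{\sqrt{\det \rho}} \prod_{\alpha=1}^{n} \int_{-\infty}^{\infty} \frac{d(d_t x_\alpha)}{\sqrt{2\pi\sigma_\alpha^2}}\, \exp\left( -\Phi[\{d_t x_\alpha\}] + \sum_{\alpha=1}^{n} J_\alpha\, {}^{\$}\mathcal{E}_\alpha\, d_t x_\alpha \right) \qquad (28.4) $$

We note that $\langle 1 \rangle = \mathcal{Z}[\{J_\alpha = 0\}]$.

We will need to get a variable that looks like the VAR. To this end, we
introduce a “P+L” variable $F$ and write the Dirac delta function constraint
equation³

$$ \int_{-\infty}^{\infty} dF\, \delta\left( F - \sum_{\alpha=1}^{n} {}^{\$}\mathcal{E}_\alpha\, d_t x_\alpha \right) = 1 \qquad (28.5) $$

We insert the factor 1 into the integrand of $\mathcal{Z}[\{J_\alpha\}]$ and rewrite 1 as
$\int_{-\infty}^{\infty} dF\, \delta\left[ F - \sum_{\alpha=1}^{n} {}^{\$}\mathcal{E}_\alpha\, d_t x_\alpha \right]$. We then use the Fourier representation of the
Dirac delta function⁴,

$$ \delta\left[ F - \sum_{\alpha=1}^{n} {}^{\$}\mathcal{E}_\alpha\, d_t x_\alpha \right] = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \exp\left[ i\omega \left( F - \sum_{\alpha=1}^{n} {}^{\$}\mathcal{E}_\alpha\, d_t x_\alpha \right) \right] \qquad (28.6) $$

We can now do all the $\{d_t x_\alpha\}$ integrals. We have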


3 \$Units: The VAR-variable $F$ has units \$. The conjugate variables $J_\alpha$ and Fourier
variable $\omega$ have units 1/\$. This notation is not in the equations to avoid clutter.


4 What, Again? The reader who has followed this book will notice a repetition of the
same theme regarding the generating function or functional, introducing the constraints
through Dirac delta functions, using the Fourier transform, and then doing the integrals.




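$$ \frac{1}{\sqrt{\det \rho}} \prod_{\alpha=1}^{n} \int_{-\infty}^{\infty} \frac{d(d_t x_\alpha)}{\sqrt{2\pi\sigma_\alpha^2}}\, \exp\left( -\Phi[\{d_t x_\alpha\}] + \sum_{\alpha=1}^{n} (J_\alpha - i\omega)\, {}^{\$}\mathcal{E}_\alpha\, d_t x_\alpha \right) = \exp\left[ \frac{1}{2} \sum_{\alpha,\beta=1}^{n} (J_\alpha - i\omega)\, {}^{\$}\mathcal{E}_\alpha \sigma_\alpha \rho_{\alpha\beta} \sigma_\beta\, {}^{\$}\mathcal{E}_\beta\, (J_\beta - i\omega) \right] \qquad (28.7) $$

Now from Ch. 26, we recall that

$$ {}^{\$}CVAR^{Quad}_\alpha = \frac{{}^{\$}\mathcal{E}_\alpha \sigma_\alpha}{{}^{\$}VAR^{Quad}} \sum_{\beta=1}^{n} \rho_{\alpha\beta}\, \sigma_\beta\, {}^{\$}\mathcal{E}_\beta\,, \qquad {}^{\$}VAR^{Quad} = \sum_{\alpha=1}^{n} {}^{\$}CVAR^{Quad}_\alpha \qquad (28.8) $$

With these substitutions, we find that the Fourier transform integral is

$$ \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \exp\left\{ i\omega \left[ F - {}^{\$}VAR^{Quad} \sum_{\alpha=1}^{n} J_\alpha\, {}^{\$}CVAR^{Quad}_\alpha \right] - \frac{\omega^2}{2} \left( {}^{\$}VAR^{Quad} \right)^2 \right\} = \frac{1}{{}^{\$}VAR^{Quad} \sqrt{2\pi}}\, \exp\left[ \varpi(F;\{J_\alpha\}) \right] \qquad (28.9) $$

Here

$$ \varpi(F;\{J_\alpha\}) = -\frac{1}{2 \left( {}^{\$}VAR^{Quad} \right)^2} \left[ F - {}^{\$}VAR^{Quad} \sum_{\alpha=1}^{n} J_\alpha\, {}^{\$}CVAR^{Quad}_\alpha \right]^2 \qquad (28.10) $$

We also write ${}^{\$}\sigma^2_{\alpha\beta} = {}^{\$}\mathcal{E}_\alpha \sigma_\alpha \rho_{\alpha\beta} \sigma_\beta\, {}^{\$}\mathcal{E}_\beta$. The result for the generating function
is then

$$ \mathcal{Z}[\{J_\alpha\}] = \int_{F=-\infty}^{F=\infty} d\mu(F)\, \exp\left[ \Omega(F;\{J_\alpha\}) \right] \qquad (28.11) $$

Here the measure $d\mu(F)$ in the VAR-variable $F$ reads

$$ d\mu(F) = \frac{dF}{{}^{\$}VAR^{Quad} \sqrt{2\pi}}\, \exp\left[ -\frac{F^2}{2 \left( {}^{\$}VAR^{Quad} \right)^2} \right] \qquad (28.12) $$

Note that this Gaussian measure for $F$ has volatility ${}^{\$}VAR^{Quad}$. Therefore, at
$k_{CL}$ standard deviations in $F$, we have $F = k_{CL}\, {}^{\$}VAR^{Quad}$.

The exponent $\Omega(F;\{J_\alpha\})$ is

$$ \Omega(F;\{J_\alpha\}) = \frac{F}{{}^{\$}VAR^{Quad}} \sum_{\alpha=1}^{n} J_\alpha\, {}^{\$}CVAR^{Quad}_\alpha - \frac{1}{2} \left[ \sum_{\alpha=1}^{n} J_\alpha\, {}^{\$}CVAR^{Quad}_\alpha \right]^2 + \frac{1}{2} \sum_{\alpha,\beta=1}^{n} J_\alpha\, {}^{\$}\sigma^2_{\alpha\beta}\, J_\beta \qquad (28.13) $$

This completes the calculation of the generating function $\mathcal{Z}[\{J_\alpha\}]$.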


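Calculation of the Exposure*Underlying Change Moments
We take derivatives with respect to $\{J_\alpha\}$ and then set $\{J_\alpha = 0\}$ to get the
moments.

Recall we set $\langle d_t x_\alpha \rangle = 0$ because the VAR is supposed to represent
fluctuations away from the average. Therefore, up to the first two moments, we
have

$$ \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma \rangle = \left[ \partial \mathcal{Z}[\{J_\alpha\}] / \partial J_\gamma \right]_{\{J_\alpha = 0\}} \qquad (28.14) $$

$$ \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma\;\, {}^{\$}\mathcal{E}_{\gamma'} \cdot d_t x_{\gamma'} \rangle = \left[ \partial^2 \mathcal{Z}[\{J_\alpha\}] / \partial J_\gamma \partial J_{\gamma'} \right]_{\{J_\alpha = 0\}} \qquad (28.15) $$

After a little algebra we find

$$ \left[ \partial \Omega(F;\{J_\alpha\}) / \partial J_\gamma \right]_{\{J_\alpha = 0\}} = \frac{F}{{}^{\$}VAR^{Quad}}\, {}^{\$}CVAR^{Quad}_\gamma \qquad (28.16) $$

$$ \left[ \partial^2 \Omega(F;\{J_\alpha\}) / \partial J_\gamma \partial J_{\gamma'} \right]_{\{J_\alpha = 0\}} = {}^{\$}\sigma^2_{\gamma\gamma'} - {}^{\$}CVAR^{Quad}_\gamma\, {}^{\$}CVAR^{Quad}_{\gamma'} \qquad (28.17) $$

Therefore

$$ \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma \rangle = \int_{F=-\infty}^{F=\infty} d\mu(F)\, \frac{F}{{}^{\$}VAR^{Quad}}\, {}^{\$}CVAR^{Quad}_\gamma \qquad (28.18) $$

Also,

$$ \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma\;\, {}^{\$}\mathcal{E}_{\gamma'} \cdot d_t x_{\gamma'} \rangle = \int_{F=-\infty}^{F=\infty} d\mu(F) \left\{ \frac{F^2\, {}^{\$}CVAR^{Quad}_\gamma\, {}^{\$}CVAR^{Quad}_{\gamma'}}{\left( {}^{\$}VAR^{Quad} \right)^2} + \left( {}^{\$}\sigma^2_{\gamma\gamma'} - {}^{\$}CVAR^{Quad}_\gamma\, {}^{\$}CVAR^{Quad}_{\gamma'} \right) \right\} \qquad (28.19) $$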

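VAR, CVARs (= Component VARs), CVAR Volatilities
The total ${}^{\$}VAR(k_{CL}, dt)$ is equal to the VAR-variable $F$ at a prescribed CL (e.g.
99%) corresponding to $k_{CL}$ standard deviations in $F$, and for a specific time
interval $dt$. Hence, we drop the integration $\int_{F=-\infty}^{F=\infty} d\mu(F)$ and substitute
$F = k_{CL} \sqrt{dt}\; {}^{\$}VAR^{Quad}$ in the integrand (with the factor of $\sqrt{dt}$ restored). So
we get

$$ {}^{\$}VAR(k_{CL}, dt) = k_{CL} \sqrt{dt}\; {}^{\$}VAR^{Quad} \qquad (28.20) $$

The CVAR is just the average risk for a specific underlying at $k_{CL}$ standard
deviations for the total VAR, that is

$$ {}^{\$}CVAR_\gamma = \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma \rangle_{k_{CL}} \qquad (28.21) $$

This is, restoring the $\sqrt{dt}$ factor, ${}^{\$}CVAR_\gamma = k_{CL} \sqrt{dt}\; {}^{\$}CVAR^{Quad}_\gamma$.

The second moment at $k_{CL}$ standard deviations for the total VAR is

$$ \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma\;\, {}^{\$}\mathcal{E}_{\gamma'} \cdot d_t x_{\gamma'} \rangle_{c;\,k_{CL}} = {}^{\$}\sigma^2_{\gamma\gamma'} - {}^{\$}CVAR^{Quad}_\gamma\, {}^{\$}CVAR^{Quad}_{\gamma'} \qquad (28.22) $$

Here, the connected part of the two-point function at $k_{CL}$ is defined as

$$ \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma\;\, {}^{\$}\mathcal{E}_{\gamma'} \cdot d_t x_{\gamma'} \rangle_{c;\,k_{CL}} = \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma\;\, {}^{\$}\mathcal{E}_{\gamma'} \cdot d_t x_{\gamma'} \rangle_{k_{CL}} - \langle {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma \rangle_{k_{CL}}\, \langle {}^{\$}\mathcal{E}_{\gamma'} \cdot d_t x_{\gamma'} \rangle_{k_{CL}} \qquad (28.23) $$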

The Component VAR (CVAR) Volatility 
If we set $\gamma' = \gamma$ in this equation, we get the square of the CVAR volatility
${}^{\$}\sigma_{CVAR_\gamma}$, namely

$$ {}^{\$}\sigma^2_{CVAR_\gamma} = \langle [ {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma ]^2 \rangle_{c;\,k_{CL}} = \left[ {}^{\$}\mathcal{E}_\gamma \sigma_\gamma \right]^2 - \left[ {}^{\$}CVAR^{Quad}_\gamma \right]^2 \qquad (28.24) $$

Notice that the CVAR volatility is not zero. Therefore, there is an uncertainty
in each CVAR. For the linear case, Eqn. (28.24) is the exact formula for the
CVAR volatility. Also, note that the CVAR volatility is independent of $k_{CL}$. This
is because the second bracket in Eqn. (28.19) is independent of $F$.
Using the “ergodic trick” of Ch. 26, the validity of the above formula can be 
(and has been) checked using Monte Carlo simulation for linear portfolios. 


The Total VAR has Zero Volatility 
Even though there is a volatility for each CVAR, there is no volatility at all for
the total VAR, i.e. ${}^{\$}\sigma_{VAR} = 0$. This is easily seen by summing over all $(\gamma, \gamma')$
in the two-point function above to get

$$ {}^{\$}\sigma^2_{VAR} = \left\langle \left[ \sum_{\gamma=1}^{n} {}^{\$}\mathcal{E}_\gamma \cdot d_t x_\gamma \right]^2 \right\rangle_{c;\,k_{CL}} = \left[ {}^{\$}VAR^{Quad} \right]^2 - \left[ \sum_{\gamma=1}^{n} {}^{\$}CVAR^{Quad}_\gamma \right]^2 = 0 \qquad (28.25) $$

Since the VAR volatility is zero, the VAR is exactly determined. This shows
clearly that the CVAR volatility has nothing to do with Monte-Carlo statistical 
noise for a finite number of events, since all the above calculations have been 
done in the continuous limit corresponding to an infinite Monte-Carlo simulation. 
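
The linear-case results can be verified numerically. The following Python sketch assumes the quadratic-form definitions recalled from Ch. 26 (quadratic VAR as the square root of the exposure-vol-correlation quadratic form, and each quadratic CVAR as that component's share of it) and the statement of Eq. (28.24) that the squared CVAR volatility is the squared exposure-times-vol minus the squared quadratic CVAR. Function names and the two-asset numbers are hypothetical, and everything is at one SD with the time factor omitted, as in the text.

```python
import math

def quadratic_var(exposures, vols, corr):
    """Quadratic VAR at one SD: sqrt of sum_ab E_a s_a rho_ab s_b E_b."""
    n = len(exposures)
    variance = sum(
        exposures[a] * vols[a] * corr[a][b] * vols[b] * exposures[b]
        for a in range(n) for b in range(n)
    )
    return math.sqrt(variance)

def quadratic_cvars(exposures, vols, corr):
    """Component VARs at one SD; they add up to the quadratic VAR."""
    n = len(exposures)
    var_quad = quadratic_var(exposures, vols, corr)
    return [
        exposures[a] * vols[a]
        * sum(corr[a][b] * vols[b] * exposures[b] for b in range(n)) / var_quad
        for a in range(n)
    ]

def cvar_volatilities(exposures, vols, corr):
    """CVAR volatility: sqrt((E_a s_a)^2 - CVAR_a^2), a right-triangle relation."""
    cvars = quadratic_cvars(exposures, vols, corr)
    return [
        math.sqrt(max(0.0, (exposures[a] * vols[a]) ** 2 - cvars[a] ** 2))
        for a in range(len(exposures))
    ]

# Hypothetical two-variable portfolio with correlation 0.5.
exposures = [100.0, 50.0]
vols = [0.1, 0.2]
corr = [[1.0, 0.5], [0.5, 1.0]]

var_quad = quadratic_var(exposures, vols, corr)
cvars = quadratic_cvars(exposures, vols, corr)
cvar_vols = cvar_volatilities(exposures, vols, corr)
```

In this example each exposure-times-vol is 10, each quadratic CVAR is about 8.66, and each CVAR volatility is 5. The CVARs sum to the quadratic VAR, which is why the summed connected two-point function, and hence the volatility of the total VAR, vanishes.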


The CVAR Volatility Triangle 


From Eqn. (28.24) we have a nice geometrical right triangle involving the CVAR 
volatility, as shown in the picture below: 


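[Figure: The CVAR Volatility Triangle. A right triangle with hypotenuse \$Exposure * vol = ${}^{\$}\mathcal{E}_\gamma \sigma_\gamma$ and legs ${}^{\$}CVAR^{Quad}_\gamma$ and the CVAR volatility ${}^{\$}\sigma_{CVAR_\gamma}$.]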


The CVAR Volatilities for the Nonlinear Risk Case 


The existence of the CVAR volatility is not at all limited to the linear case. The 
uncertainty in the components of risk for a given total risk is a general concept. 

Moreover, it turns out that even for somewhat non-linear portfolios, the 
CVAR volatility formula (28.24) is a reasonably good approximation. This has 
been explicitly checked by Monte-Carlo generated CVAR volatility with 
convexity approximated by using a grid. Only for highly convex option portfolios 
does Eqn. (28.24) break down. 


CVAR and CVAR-Vol Risk Report, and a Related Portfolio Strategy 

The CVAR / CVAR-Vol risk report shows the component risks (the CVARs) of 
VAR, and their uncertainties (the CVAR Vols). The example shown below is 
generic with four risks; the total VAR = 240. Each “academic” CVAR, the
average CVAR, is leftmost in each group of 3 rectangles. The CVAR points for
±1 SD in CVAR-Vol units are given by each error bar, exhibited by the other
two rectangles, e.g. $46 < CVAR_1 < 133$.

A portfolio strategy could change exposures to take advantage of the CVAR
uncertainties. Although $CVAR^{Academic}_1 = 90$, risk #1 could actually be lower. Let
exposure #1 be increased, keeping the new $CVAR^{Academic}_1 < 133$. Changes in
other exposures are needed to keep the same total VAR, and to ensure the new
academic CVARs are within the CVAR error bars for the original exposures. The
new portfolio would then have sensibly similar risk to the original portfolio.

An optimization scheme could be used to maximize the return in this scheme. 


[Figure: CVAR and CVAR-Vol risk report for Risk #1 through Risk #4. Legend: average CVARs, low CVARs, high CVARs; "Error bar" = ± Vol(CVAR).]


Effective Number of SD for Underlying Variables 


From the value of $\$\mathrm{CVAR}_\alpha = \$\mathcal{E}_\alpha \langle d_t x_\alpha \rangle_{k_{CL}}$, we can back out an effective
number of standard deviations $k_\alpha^{(\mathrm{eff})}$ for the variable $d_t x_\alpha$. This is done simply
using the (ergodic average) value of $d_t x_\alpha$ in the state giving the total VAR at
$k_{CL}$ SD. We write

    $k_\alpha^{(\mathrm{eff})} = \langle d_t x_\alpha \rangle_{k_{CL}} / \sigma_\alpha$    (28.26)

Because of diversification due to correlations, the effective number of standard
deviations $k_\alpha^{(\mathrm{eff})}$ is less than the VAR $k_{CL}$, i.e. $k_\alpha^{(\mathrm{eff})} < k_{CL}$. For example
consider $k_{CL} = 3.43$ SD for a Stressed VAR calculation at the 99.97% CL.
Suppose² $\langle d_t x_\alpha \rangle_{k_{CL}} = 1$ and $\sigma_\alpha = 0.5$. Then the number of standard deviations
on average for this variable $d_t x_\alpha$ is $k_\alpha^{(\mathrm{eff})} = 2$. Values of $k_\alpha^{(\mathrm{eff})}$ around 2 are
attained by a few of the riskiest variables at the overall VAR 99.97% CL for
diversified portfolios, as mentioned before.


Fat-Tail Vol Consistency with the Effective Number of SD $k_\alpha^{(\mathrm{eff})}$


There is a consistency check of the definition of fat-tail (FT) vols with the
effective number of standard deviations. As discussed in Ch. 21, we defined FT
vols at the 99% CL or 2.33 SD. This is in fact roughly consistent with the most
important variables having $k_\alpha^{(\mathrm{eff})} \approx 2$. For the less important variables, the use of
the FT vol will increase the risk somewhat, but since less important variables
have less risk, the total risk is not much affected.


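Calculation of $k_\alpha^{(\mathrm{eff})}$ in the Linear Risk Case


Because we have the expression for the CVAR for the linear case, we can write
down $k_\alpha^{(\mathrm{eff})}$. We get

    $k_\alpha^{(\mathrm{eff})} = k_{CL} \sum_{\beta=1}^{n} \rho_{\alpha\beta}\, \$\mathcal{E}_\beta \sigma_\beta \,/\, \$\mathrm{VAR}^{\mathrm{Quad}}$    (28.27)

To get an idea, we evaluate this in the case of equal exposures, equal vols,
and constant correlation $\rho_0$. The correlation matrix is $\rho_{\alpha\beta} = \delta_{\alpha\beta} + \rho_0 (1 - \delta_{\alpha\beta})$.

It is not hard to see that

    $\det \rho = (1 - \rho_0)^{n-1} [1 + (n-1)\rho_0]$    (28.28)

where $n$ is the number of variables. Hence, we need $\rho_0 > -1/(n-1)$ for a
positive definite correlation matrix. Substituting, we find

    $k_\alpha^{(\mathrm{eff})} = k_{CL} \sqrt{\rho_0 (1 - 1/n) + 1/n}$    (28.29)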


² Time Units: If we are doing, say, quarterly VAR calculations where dt = 65 days, then
the volatility is a quarterly volatility, either scaled up by sqrt(65) from the daily vol or
else defined using windows of 65 days.

We see that the positive-definite constraint keeps $k_\alpha^{(\mathrm{eff})}$ a real number. For
large $n$ we get $k_\alpha^{(\mathrm{eff})} \approx k_{CL} \sqrt{\rho_0}$. Even for correlations around 0.25, which
on average is a large correlation, we see that $k_\alpha^{(\mathrm{eff})} = k_{CL}/2$, which is still
substantially less than $k_{CL}$. If $\rho_0 = 0$ we get $k_\alpha^{(\mathrm{eff})} = k_{CL}/\sqrt{n}$, which goes to 0 as $n$ grows.
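The limiting cases above are easy to confirm numerically. The following is a minimal sketch, assuming equal exposure-times-vol products of 1 for every variable; the helper names `constant_corr_matrix`, `k_eff_linear` and `k_eff_constant_corr` are hypothetical. It evaluates Eq. (28.27) directly and compares it with the closed form (28.29).

```python
import math

def constant_corr_matrix(n, rho0):
    """n by n correlation matrix with 1 on the diagonal and rho0 elsewhere."""
    return [[1.0 if i == j else rho0 for j in range(n)] for i in range(n)]

def k_eff_linear(k_cl, ev, rho, alpha):
    """Eq. (28.27): effective number of SD for variable alpha in the linear case."""
    n = len(ev)
    var_quad = math.sqrt(sum(ev[a] * rho[a][b] * ev[b] for a in range(n) for b in range(n)))
    return k_cl * sum(rho[alpha][b] * ev[b] for b in range(n)) / var_quad

def k_eff_constant_corr(k_cl, n, rho0):
    """Eq. (28.29): closed form for equal exposures, equal vols, constant correlation."""
    return k_cl * math.sqrt(rho0 * (1.0 - 1.0 / n) + 1.0 / n)

k_cl = 3.43            # 99.97% CL, as in the example in the text
n, rho0 = 50, 0.25
ev = [1.0] * n         # equal exposure * vol for every variable (assumption)
direct = k_eff_linear(k_cl, ev, constant_corr_matrix(n, rho0), alpha=0)
closed = k_eff_constant_corr(k_cl, n, rho0)
large_n_limit = k_cl * math.sqrt(rho0)          # equals k_cl / 2 for rho0 = 0.25
uncorrelated = k_eff_constant_corr(k_cl, n, 0.0)  # equals k_cl / sqrt(n)
```

With these inputs `direct` and `closed` agree, and `closed` sits slightly above the large-n limit `k_cl / 2`.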


Eigenvalues of Constant Correlation Matrix (one correlation) 
The eigenvalues $\lambda_i$ of the constant $n$ by $n$ correlation matrix $\rho$ can be
found using Eq. (28.28) with the replacement $\rho_0 \to \rho_0 / (1 - \lambda)$ to solve
$\det(\lambda I - \rho) = 0$. We find $\lambda_1 = 1 + (n-1)\rho_0$ and $\lambda_i = (1 - \rho_0)$ for $i = 2 \ldots n$.
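As a quick sanity check of these eigenvalues, the sketch below applies the constant correlation matrix to two trial vectors: the all-ones vector and a vector whose entries sum to zero. The values of `n` and `rho0` and the helper names are hypothetical.

```python
def constant_corr_matrix(n, rho0):
    """n by n correlation matrix with 1 on the diagonal and rho0 elsewhere."""
    return [[1.0 if i == j else rho0 for j in range(n)] for i in range(n)]

def matvec(m, v):
    """Plain matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

n, rho0 = 5, 0.25
rho = constant_corr_matrix(n, rho0)

# The all-ones vector carries the large eigenvalue 1 + (n - 1) * rho0.
v_top = [1.0] * n
lam_top = 1.0 + (n - 1) * rho0
residual_top = max(abs(x - lam_top * y) for x, y in zip(matvec(rho, v_top), v_top))

# Any vector whose entries sum to zero carries the eigenvalue 1 - rho0.
v_rest = [1.0, -1.0] + [0.0] * (n - 2)
lam_rest = 1.0 - rho0
residual_rest = max(abs(x - lam_rest * y) for x, y in zip(matvec(rho, v_rest), v_rest))

# The product of the eigenvalues reproduces the determinant of Eq. (28.28).
det_from_eigs = lam_top * lam_rest ** (n - 1)
det_formula = (1.0 - rho0) ** (n - 1) * (1.0 + (n - 1) * rho0)
```

The zero-sum vectors span an (n - 1)-dimensional space, which accounts for the multiplicity of the eigenvalue $1 - \rho_0$.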


Extension to Multiple Time Steps using Path Integrals 


Although we have no illusions about the practicality of performing a multiple 
step MC simulation with hundreds (if not thousands or even tens of thousands) of 
variables as needed for corporate-wide VAR, we can nonetheless easily extend 
the formalism using standard path integral techniques. We outline the linear case. 
The interested reader will have no difficulty filling in the steps. All the 
improvements to the VAR discussed in previous chapters can be implemented in 
principle. 

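The time labels are $l, m = 1 \ldots N$, and we retain the internal index labels
$\alpha, \beta, \gamma = 1 \ldots n$. The $\$\mathcal{E}_\alpha$ exposure at $t_l$ is $\$\mathcal{E}_{\alpha l} = \$\mathcal{E}_\alpha(t_l)$ and the $x_\alpha$ time
difference over $dt_l$ at $t_l$ is $d_l x_{\alpha l} = d_l x_\alpha(t_l) \equiv x_\alpha(t_l + dt_l) - x_\alpha(t_l)$. Step
by step, we proceed exactly as above. The probability density exponent becomes

    $\Phi[\{d_l x_{\alpha l}\}] = \frac{1}{2} \sum_{l=1}^{N} \sum_{\alpha,\beta=1}^{n} \frac{d_l x_{\alpha l}}{\sigma_{\alpha l}} (\rho^{-1})_{\alpha\beta} \frac{d_l x_{\beta l}}{\sigma_{\beta l}}$    (28.30)

Here the volatilities contain the factor $\sqrt{dt_l}$. The conjugate variables become
$\{J_{\alpha l}\}$. The time-local CVAR variables also pick up the time index,
$\$\mathrm{CVAR}_{\gamma;l} = \$\mathcal{E}_{\gamma l} \langle d_l x_{\gamma l} \rangle_{k_{CL}}$, still at the specified overall CL for the total
VAR. The total VAR over the complete time interval $T = \sum_{l=1}^{N} dt_l$ for $k_{CL}$
standard deviations is $\$\mathrm{VAR}_{k_{CL}} = \sum_{l=1}^{N} \sum_{\gamma=1}^{n} \$\mathrm{CVAR}_{\gamma;l}$.

Repeating the algebra for the crossed second moment, we find that

    $\langle \$\mathcal{E}_{\gamma l}\, d_l x_{\gamma l} \; \$\mathcal{E}_{\gamma' m}\, d_m x_{\gamma' m} \rangle_c = \delta_{lm}\, \$\mathcal{E}_{\gamma l} \sigma_{\gamma l}\, \rho_{\gamma\gamma'}\, \$\mathcal{E}_{\gamma' l} \sigma_{\gamma' l} - \$\mathrm{CVAR}^{\mathrm{Quad}}_{\gamma;l}\, \$\mathrm{CVAR}^{\mathrm{Quad}}_{\gamma';m}$    (28.31)

Notice that the subtracted term has two factors at different times $t_l$, $t_m$.

The local CVAR volatility at fixed time obtained by setting $l = m$ and
$\gamma = \gamma'$ obeys the same triangle relation as found above for the single time step
case.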


29. VAR and Component VAR for Two Variables 
(Tech. Index 5/10) 


The Component VAR (CVAR) Volatility with Two Variables 


Here, we restrict our attention to two variables. We begin with the CVAR
volatility. Here is a picture of the geometry:

[Figure: Geometry for the CVAR vol, CVARs, and Exposures * Vols for n = 2 variables.]

The CVAR volatility¹ turns out to be the same for both variables. Both
triangles with $\$\mathrm{CVAR}_1^{\mathrm{Quad}}$, $\$\mathrm{CVAR}_2^{\mathrm{Quad}}$ have a common leg, the CVAR
volatility $\$\sigma_{\mathrm{CVAR}}$. Writing the correlation $\rho_{12} = \cos\theta$, the CVAR volatility is

    $\$\sigma_{\mathrm{CVAR}} = \$\mathcal{E}_1 \sigma_1\, \$\mathcal{E}_2 \sigma_2 \sin\theta \,/\, \$\mathrm{VAR}^{\mathrm{Quad}}$    (29.1)


Geometry, Math: Risk Ellipse, VAR Line, CVAR, CVAR Vol


The following diagram gives the idea for the geometry. The details are below:

[Figure: The VAR line $\$\mathcal{E}_1 \cdot d_t x_1 + \$\mathcal{E}_2 \cdot d_t x_2 = \$\mathrm{VAR}_{99\%}$ and the risk ellipse $E_{93\%}$ at $k_{CL} = 2.33$ SD, with 93% of events inside. The tangent point of the line to the ellipse gives the nominal CVARs at 99% VAR. The CVAR volatility $\$\sigma_{\mathrm{CVAR}}$ is shown around the nominal CVAR.]

The axes are for the two individual risks $\$\mathcal{E}_1 \cdot d_t x_1$ and $\$\mathcal{E}_2 \cdot d_t x_2$. For
illustration, we used a 99% CL.

¹ Synopsis: For those of you who just tuned in, CVAR volatility measures the uncertainty
in the contribution of risk of the corresponding variable to the total VAR. The
superscripts “Quad” indicate quadratic forms appropriate in the case of linear risk. For
notational simplicity, dt = 1 here. See preceding VAR chapters for details.


The VAR Line for Two Variables 

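The line $\$\mathcal{E}_1 d_t x_1 + \$\mathcal{E}_2 d_t x_2 = \$\mathrm{VAR}_{99\%}$ is made up of points that, in a
MC simulation, produce the value of the VAR at the 99% CL. This line defines a
region to the left of the line containing 99% of probability, i.e. 99% of the MC
events. To see this, recall that the integrated probability for the P&L
variable $F$ to be less than some given value $F_{CL}$ is²

    $\wp[F < F_{CL}] = \int_{-\infty}^{F_{CL}} \frac{dF}{\sqrt{2\pi}\, \$\mathrm{VAR}^{\mathrm{Quad}}} \exp\left[ -\frac{F^2}{2 (\$\mathrm{VAR}^{\mathrm{Quad}})^2} \right]$    (29.2)

Setting $F_{CL} = k_{CL}\, \$\mathrm{VAR}^{\mathrm{Quad}}$, we get $\wp[F < F_{CL}] = N(k_{CL})$. For example,
we get the usual 0.99 for $k_{CL} = 2.33$.
The quadratic form VAR is given by the usual 2-variable expression

    $(\$\mathrm{VAR}^{\mathrm{Quad}})^2 = (\$\mathcal{E}_1 \sigma_1)^2 + (\$\mathcal{E}_2 \sigma_2)^2 + 2\, \$\mathcal{E}_1 \sigma_1\, \$\mathcal{E}_2 \sigma_2\, \rho_{12}$    (29.3)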
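The confidence levels quoted in the text follow from the cumulative normal $N(k_{CL})$. A minimal sketch, using only the standard-library error function; the exposure-times-vol numbers and the names `normal_cdf` and `quad_var_two` are hypothetical:

```python
import math

def normal_cdf(k):
    """Cumulative standard normal N(k), written with the error function."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))

def quad_var_two(e1s1, e2s2, rho12):
    """Two-variable quadratic-form VAR of Eq. (29.3)."""
    return math.sqrt(e1s1 ** 2 + e2s2 ** 2 + 2.0 * e1s1 * e2s2 * rho12)

cl_233 = normal_cdf(2.33)   # close to the 99% CL used in this chapter
cl_343 = normal_cdf(3.43)   # close to the 99.97% CL used for Stressed VAR

# Hypothetical exposure * vol products (dollars) and correlation.
var_quad = quad_var_two(100.0, 60.0, 0.3)
var_99 = 2.33 * var_quad    # total VAR at k_CL = 2.33 with dt = 1
```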


Formalism for the Risk Ellipse Е in Two Variables 


The Risk Ellipse is an ellipse $E$ whose boundary has a constant probability. At
$n = 2$, the total probability distribution is the Gaussian (c.f. Ch. 28)

    $\wp[\{d_t x_\alpha\}] = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho_{12}^2}} \exp\left( -\Phi[\{d_t x_\alpha\}] \right)$    (29.4)

We will wind up naturally not integrating over the full range of the variables.
Writing the exponent as $\Phi[\{d_t x_\alpha\}] = R^2/2$, and setting $\rho_{12} = \cos\theta$, we have

    $R^2 \sin^2\theta = \left( \frac{d_t x_1}{\sigma_1} \right)^2 + \left( \frac{d_t x_2}{\sigma_2} \right)^2 - 2\, \frac{d_t x_1}{\sigma_1} \frac{d_t x_2}{\sigma_2} \cos\theta$    (29.5)

The surface traced out by the equation $R$ = constant is an ellipse in the
$(d_t x_1, d_t x_2)$ plane, and a different ellipse $E$ in the $(\$\mathcal{E}_1 \cdot d_t x_1, \$\mathcal{E}_2 \cdot d_t x_2)$
plane.

² The VAR pdf: Again for those who skipped the formalism, under the linear risk
assumption the possible values of the VAR are distributed as a Gaussian. The width of
this Gaussian is just the quadratic-form VAR. To put in the dt dependence, the quadratic-
form VAR is multiplied by sqrt(dt).


Relation of VAR and the CVARs to the Risk Ellipse 
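The ellipse $E$ has the property that if $R = k_{CL}$, then $E$ is tangent to the VAR
line, and this point is an extremal point for the risk on $E$. Moreover, the CVARs
at this point are exactly the nominal values. We show this next.

We start by changing variables to elliptic co-ordinates, $d_t x_1 = \sigma_1 R \cos\beta$,
$d_t x_2 = \sigma_2 R \cos(\beta - \theta)$. The probability measure is just $R \exp(-R^2/2)\, dR\, d\beta$
up to the constant factor $1/(2\pi)$, independent of $\beta$. The total risk on the ellipse at fixed $R$ is therefore
$\$C_{\mathrm{Tot}}(R, \beta) = \$\mathcal{E}_1 \sigma_1 R \cos\beta + \$\mathcal{E}_2 \sigma_2 R \cos(\beta - \theta)$. We want the risk maximum
on the ellipse by moving the angle $\beta$. Setting the derivative of $\$C_{\mathrm{Tot}}$ with
respect to $\beta$ equal to zero at $\beta_{\max}$ we get $\$\mathcal{E}_1 \sigma_1 \cos\beta_{\max} = \$\mathrm{CVAR}_1^{\mathrm{Quad}}$ and
$\$\mathcal{E}_2 \sigma_2 \cos(\beta_{\max} - \theta) = \$\mathrm{CVAR}_2^{\mathrm{Quad}}$. Here the quadratic CVARs are the usual
expressions (again, $\rho_{12} = \cos\theta$):

    $\$\mathrm{CVAR}_1^{\mathrm{Quad}} = \$\mathcal{E}_1 \sigma_1 \left( \$\mathcal{E}_1 \sigma_1 + \cos\theta\, \$\mathcal{E}_2 \sigma_2 \right) / \$\mathrm{VAR}^{\mathrm{Quad}}$
                                                                                  (29.6)
    $\$\mathrm{CVAR}_2^{\mathrm{Quad}} = \$\mathcal{E}_2 \sigma_2 \left( \$\mathcal{E}_2 \sigma_2 + \cos\theta\, \$\mathcal{E}_1 \sigma_1 \right) / \$\mathrm{VAR}^{\mathrm{Quad}}$

We have $\$\mathrm{VAR}^{\mathrm{Quad}} = \$\mathrm{CVAR}_1^{\mathrm{Quad}} + \$\mathrm{CVAR}_2^{\mathrm{Quad}}$. If further we set $R = k_{CL}$,
we get $\$\mathcal{E}_1 \cdot d_t x_1 |_{\beta_{\max}} = k_{CL}\, \$\mathrm{CVAR}_1^{\mathrm{Quad}}$ and $\$\mathcal{E}_2 \cdot d_t x_2 |_{\beta_{\max}} = k_{CL}\, \$\mathrm{CVAR}_2^{\mathrm{Quad}}$.
But these are just the nominal CVARs at $k_{CL}$ standard deviations for the total
risk. The maximum total risk on the ellipse $E$ is therefore
$\$C_{\mathrm{Tot}}(k_{CL}, \beta_{\max}) = k_{CL}\, \$\mathrm{VAR}^{\mathrm{Quad}}$, and this is just the total VAR.

Therefore, we have shown that the maximum risk on the ellipse is just the
total VAR. On the other hand, the risk along the VAR line is also the total VAR.
Therefore, the VAR line is tangent to the ellipse with $R = k_{CL}$ at the point $\beta_{\max}$.
This is shown by the figure above. As we go around the ellipse, the risk goes up
and down, reaching a maximum at $\beta_{\max}$. Note that when $\beta = \beta_{\max} + \pi$ (on the
opposite point of the ellipse), the sign of the risk changes and we make (instead
of lose) the maximum amount of money on the ellipse³.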

It is worth noting that the percentage of events inside $E$ generated by a MC
simulation is less than 99%, because 99% of the events lie to the left of the VAR
line and some of these events lie outside $E$. The integrated probability
$\wp[R < R_{\max}]$ inside $E$ for $R \le R_{\max}$ is

    $\wp[R < R_{\max}] = \int_0^{R_{\max}} \exp(-R^2/2)\, R\, dR \int_0^{2\pi} \frac{d\beta}{2\pi} = 1 - \exp(-R_{\max}^2/2)$    (29.7)

Taking $R_{\max} = k_{CL} = 2.33$, we get $\wp[R < k_{CL}] = 1 - \exp(-k_{CL}^2/2) = 93\%$ as
the percentage of events lying inside $E$. This is why we notated the Risk Ellipse
by $E_{93\%}$ in the figure.
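The 93% versus 99% distinction can be seen in a small simulation. This is a rough sketch only, assuming two correlated Gaussian variables with dt = 1 and hypothetical exposure-times-vol numbers; the function name `simulate_fractions` and the seed are illustrative choices.

```python
import math
import random

def simulate_fractions(e1s1, e2s2, rho12, k_cl, n_events, seed=12345):
    """Count MC events inside the risk ellipse R <= k_cl and left of the VAR line."""
    rng = random.Random(seed)
    var_quad = math.sqrt(e1s1 ** 2 + e2s2 ** 2 + 2.0 * e1s1 * e2s2 * rho12)
    var_line = k_cl * var_quad
    inside = left = inside_but_beyond = 0
    for _ in range(n_events):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Correlated standardized moves u = d_t x / sigma.
        u1 = z1
        u2 = rho12 * z1 + math.sqrt(1.0 - rho12 ** 2) * z2
        # R^2 from Eq. (29.5), with cos(theta) = rho12.
        r2 = (u1 * u1 + u2 * u2 - 2.0 * rho12 * u1 * u2) / (1.0 - rho12 ** 2)
        risk = e1s1 * u1 + e2s2 * u2
        in_ellipse = r2 <= k_cl ** 2
        left_of_line = risk < var_line
        inside += in_ellipse
        left += left_of_line
        inside_but_beyond += in_ellipse and not left_of_line
    return inside / n_events, left / n_events, inside_but_beyond

k_cl = 2.33
frac_inside, frac_left, n_violations = simulate_fractions(100.0, 60.0, 0.3, k_cl, 100000)
expected_inside = 1.0 - math.exp(-k_cl ** 2 / 2.0)   # Eq. (29.7)
```

Events inside the ellipse never cross the VAR line, while some events to the left of the line fall outside the ellipse, which is the gap between the two percentages.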


The CVAR Volatility for Two Variables (bis) 
We can now get some more insight into the CVAR volatility. Walking along the 
99% CL VAR line away from the tangent point with E, , the total risk stays 


the same (equal to VAR). However, the projections on the axes change. Since 
these projections are just the CVARs, we see that the CVARs can change with 
the total risk being unchanged. 


We already know the form of the CVAR volatility $\$\sigma_{\mathrm{CVAR}}$ because we just
calculated it. It turns out that if we look at a bigger risk ellipse $E_{99\%}$ with 99% of
the events, then the projections of the intersections of $E_{99\%}$ with the 99% VAR
line are related to the CVAR volatility. In particular, if we set $\cos\psi = k_{CL}/R$
where $R$ now corresponds to $E_{99\%}$, then the length $\ell$ of the VAR line segment
between its intersection points with $E_{99\%}$ is given by $\ell = 2\sqrt{2}\, R \sin\psi\; \$\sigma_{\mathrm{CVAR}}$.
This ends the discussion of VAR and CVAR in the two-variable case.
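The two-variable geometry can be verified directly. The sketch below is illustrative, with hypothetical exposure-times-vol numbers and helper names; it checks the tangent point, the CVAR projections, and the chord length formula, taking the radius of the 99% ellipse from Eq. (29.7).

```python
import math

# Hypothetical inputs: exposure * vol products (dollars) and correlation.
e1s1, e2s2 = 100.0, 60.0
theta = math.acos(0.3)
k_cl = 2.33

var_quad = math.sqrt(e1s1 ** 2 + e2s2 ** 2 + 2.0 * e1s1 * e2s2 * math.cos(theta))
cvar1 = e1s1 * (e1s1 + math.cos(theta) * e2s2) / var_quad   # Eq. (29.6)
cvar2 = e2s2 * (e2s2 + math.cos(theta) * e1s1) / var_quad
sigma_cvar = e1s1 * e2s2 * math.sin(theta) / var_quad       # Eq. (29.1)

def point_on_ellipse(radius, beta):
    """Point in the (risk 1, risk 2) plane in elliptic co-ordinates."""
    return (e1s1 * radius * math.cos(beta), e2s2 * radius * math.cos(beta - theta))

def total_risk(radius, beta):
    return sum(point_on_ellipse(radius, beta))

# Tangent point: cos(beta_max) = CVAR_1 / (exposure_1 * vol_1).
beta_max = math.acos(cvar1 / e1s1)
tangent = point_on_ellipse(k_cl, beta_max)

# Radius of the ellipse holding 99% of events, from 1 - exp(-R^2 / 2) = 0.99.
r_99 = math.sqrt(-2.0 * math.log(1.0 - 0.99))
psi = math.acos(k_cl / r_99)
p_plus = point_on_ellipse(r_99, beta_max + psi)
p_minus = point_on_ellipse(r_99, beta_max - psi)
chord = math.hypot(p_plus[0] - p_minus[0], p_plus[1] - p_minus[1])
chord_formula = 2.0 * math.sqrt(2.0) * r_99 * math.sin(psi) * sigma_cvar
```

Both intersection points carry the same total risk as the tangent point, so the segment between them lies on the VAR line.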


³ Notation and Signs: Note that by convention here, positive VAR means a loss.


30. Corporate-Level VAR (Tech. Index 3/10)


In this chapter, we consider additional topics related to applications of VAR and 
CVAR for corporate-level risk management. We first discuss aggregation issues. 
We then discuss implied correlations between business unit P&Ls. We end with a 
consideration of aged inventory. 


Aggregation, Desks, Business Units, Corporate Hierarchy 


Corporate Structure and Practical Aggregation Difficulties 


Large banks and broker-dealers have a complex internal structure involving 
hundreds of products dealt with on many desks. The desks are arranged in a 
hierarchy into business units and/or divisions¹. For corporate risk management
over this entire structure, it is necessary to aggregate the risks of the individual 
components. A plethora of problems or difficulties can arise, both technical and 
non-technical. 

Some technical difficulties, not necessarily in order of importance and 
certainly not complete, include: 


• Data for time series: Availability, consistency, completeness etc.
• Systems: Hundreds of feeds, thousands of variables, legacy issues etc.
• Risk measures: Availability, timeliness, consistency, completeness etc.
• Calculation: Level of sophistication, huge correlation matrices etc.


We have spent a fair amount of time in this book discussing these technical issues
in some detail. Other formidable difficulties are non-technical, including budgets,
priorities, time limitations, personnel, communication, sociology, etc. Moreover,
in this age of acquisitions and mergers, the corporate structure can change²,
requiring flexibility. Regulatory requirements also exert pressure. The
bottom line is that these real-life issues can make corporate-level risk aggregation
a gigantic, long, painful effort.

¹ Example: A business unit or division can be fixed income, equities, etc. The fixed
income business unit has different desks, e.g. U.S. mortgages, corporate bonds, etc. The
mortgage desk has a substructure of desks trading different mortgage products. The
nomenclature and hierarchical details depend on the institution. Businesses of a
completely different character within the corporation can include insurance, commercial
credit cards, etc. A corporate hierarchy of divisions and subsidiaries exists for a large
international bank (ref).


VAR Aggregation 


Conceptually there is no difficulty in writing down VAR aggregation. In Ch. 27,
we indicated the need to distinguish different products labeled by the symbol $\Im$.
We wrote the exposure for a given underlying variable change³ $d_t x_\alpha$ as the sum
$\$\mathcal{E}_\alpha = \sum_{\Im} \$\mathcal{E}_\alpha^{(\Im)}$. We can use the same idea to designate different business units,
desks, etc. in the corporate hierarchy. For simplicity, we will drop the product
label in the following discussion, although it can easily be put back.

The index $a = 1 \ldots A$ will indicate corporate structural components, which for
simplicity we just call “desks”, or sometimes “business units”. Inclusion of
various levels of hierarchy does not change the logic, only requiring more
indices. A total exposure $\$\mathcal{E}_\alpha$ for $d_t x_\alpha$ (e.g. Libor DV01) is decomposed
between desks as⁴

    $\$\mathcal{E}_\alpha = \sum_{a=1}^{A} \$\mathcal{E}_\alpha^{(a)}$    (30.1)

The result for the total VAR is then obtained exactly as before, for example
by Monte-Carlo simulation. The increasing levels of sophistication including fat-
tail vols, stressed correlations, idiosyncratic risks, convexity, liquidity, etc. can in
principle be included in the same way as we have described earlier.


² Mergers and Acquisitions: Each M&A can produce big changes in personnel, systems,
etc. I personally lived through two large M&A events and a number of smaller ones.


³ Returns: Again, the variable $x_\alpha$ can be the logarithm of the physical variable to describe
lognormal dynamics, with the time difference $d_t x_\alpha$ producing returns.


⁴ Zero Exposures: Naturally if a desk does not have a particular exposure there is no
contribution to the sum. However, it is convenient to keep the formalism general.


Desk CVARs and Correlations between Desk Risks


CVARs and Stand-Alone Risks at the Desk Level 


The risk for a given desk $a$ can be defined by summing over internal variables
and products for that desk. When the total VAR for the firm is evaluated at the
specified CL at $k_{CL}$ SD (standard deviations), we will get the CVAR for the
desk⁵, which we call $\$\mathrm{CVAR}^{(a)}$. So we write

    $\$\mathrm{CVAR}^{(a)} = \sum_{\alpha=1}^{n} \$\mathcal{E}_\alpha^{(a)} \langle d_t x_\alpha \rangle_{CL}$    (30.2)

Now the state for the total VAR at the specified $k_{CL}$ depends on the
exposures from all desks. Hence, the CVAR for desk $a$ also depends on
exposures $\$\mathcal{E}_\beta^{(b)}$ for all desks $\{b\}$. We can see this clearly if we write down
the quadratic form for the linear-risk case, now summed over desks:

    $(\$\mathrm{VAR}^{\mathrm{Quad}})^2 = \sum_{a,b=1}^{A} \sum_{\alpha,\beta=1}^{n} \$\mathcal{E}_\alpha^{(a)} \sigma_\alpha \rho_{\alpha\beta} \sigma_\beta\, \$\mathcal{E}_\beta^{(b)}$    (30.3)

Then the total VAR at the specified CL at $k_{CL}$ SD, over time $dt$, is given as
usual by

    $\$\mathrm{VAR}(k_{CL}, dt) = k_{CL} \sqrt{dt}\; \$\mathrm{VAR}^{\mathrm{Quad}}$    (30.4)

To get $\$\mathrm{CVAR}^{(a)\mathrm{Quad}}$ for desk $a$, we can use the same trick we used before
to get the CVAR for a specific underlying risk, pulling off the appropriate sum
$\sum_{a=1}^{A}$ so that $\$\mathrm{VAR}^{\mathrm{Quad}} = \sum_{a=1}^{A} \$\mathrm{CVAR}^{(a)\mathrm{Quad}}$. We get

    $\$\mathrm{CVAR}^{(a)\mathrm{Quad}} = \sum_{\alpha,\beta=1}^{n} \$\mathcal{E}_\alpha^{(a)} \sigma_\alpha \rho_{\alpha\beta} \sigma_\beta\, \$\mathcal{E}_\beta \,/\, \$\mathrm{VAR}^{\mathrm{Quad}}$    (30.5)


⁵ Overloaded VAR Nomenclature: The reader is warned that there is no consistency
between different nomenclatures for VARs. For example, sometimes the “desk VAR” we
defined is called IVAR with I = Incremental. The name CVAR (Component VAR) as
used here refers to something completely different to other people. In computer-speak,
these names would be called “overloaded”. Many unamusing time-wasting discussions
between people occur, each using a different definition of something. Naturally each will
think the other is wrong.

) VAR?" = 
SCV ARO = 23 set) (30.6) 


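We also have the “Stand-Alone” risk, which is just the expression for the risk
from a desk by itself. The quadratic stand-alone risk $\$SA^{(a)\mathrm{Quad}}$ is also the desk
volatility $\$\sigma^{(a)}$, viz

    $[\$\sigma^{(a)}]^2 = (\$SA^{(a)\mathrm{Quad}})^2 = \sum_{\alpha,\beta=1}^{n} \$\mathcal{E}_\alpha^{(a)} \sigma_\alpha \rho_{\alpha\beta} \sigma_\beta\, \$\mathcal{E}_\beta^{(a)}$    (30.7)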

Correlations between Desk Risks and P&L Correlations 


Because the CVAR for a given desk depends on the other desks, it is clear that 
correlations exist between desk risks. Note that unless the idiosyncratic risks are 
included as described in Ch. 27 for Enhanced VAR, only “normal” risks will be 
included. Further, various business-related activities are not captured by either 
design or omission. These include commissions, changes in reserves, new 
transactions, etc. Hence, it may or may not be the case that the correlations 
calculated here approximate the actual P&L correlations between desks. That is, 
the actual P&L correlations between desks may have little to do with the 
correlations appropriate for VAR calculations. Customer-related P&L is treated 
separately from trading P&L. 

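We can get an approximation to correlations between desk risks using the
linear formalism. With the above caveats, we denote the P&L for desk $a$ (at time
$t$ over time $dt$ for given moves $\{d_t x_\alpha\}$ in the underlying variables) as $\$P\&L^{(a)}$,
where

    $\$P\&L^{(a)} = \sum_{\alpha=1}^{n} \$\mathcal{E}_\alpha^{(a)}\, d_t x_\alpha$    (30.8)

Over time, assuming Gaussian statistics, we have $\langle d_t x_\alpha\, d_t x_\beta \rangle_c = \sigma_\alpha \sigma_\beta \rho_{\alpha\beta}\, dt$.
Hence, we obtain the second cross moment for the P&Ls as
$\langle \$P\&L^{(a)}\, \$P\&L^{(b)} \rangle_c = \sum_{\alpha,\beta=1}^{n} \$\mathcal{E}_\alpha^{(a)} \sigma_\alpha \rho_{\alpha\beta} \sigma_\beta\, \$\mathcal{E}_\beta^{(b)} \cdot dt$, with the usual definition
$\langle uv \rangle_c = \langle uv \rangle - \langle u \rangle \langle v \rangle$. We define the $a,b$ desk-desk P&L correlation $\rho^{(a,b)}$ by

    $\rho^{(a,b)} = \frac{\langle \$P\&L^{(a)}\, \$P\&L^{(b)} \rangle_c}{\$\sigma^{(a)}\, \$\sigma^{(b)}\, dt}$    (30.9)

where $\$\sigma^{(a)} \sqrt{dt} = \langle [\$P\&L^{(a)}]^2 \rangle_c^{1/2}$. We then get the result

    $\rho^{(a,b)} = \frac{\sum_{\alpha,\beta=1}^{n} \$\mathcal{E}_\alpha^{(a)} \sigma_\alpha \rho_{\alpha\beta} \sigma_\beta\, \$\mathcal{E}_\beta^{(b)}}{\$\sigma^{(a)}\, \$\sigma^{(b)}}$    (30.10)

and

    $(\$\mathrm{VAR}^{\mathrm{Quad}})^2 = \sum_{a,b=1}^{A} \$\sigma^{(a)} \rho^{(a,b)}\, \$\sigma^{(b)}$    (30.11)

This gives $\$\mathrm{VAR}^{\mathrm{Quad}}$ in terms of desk stand-alone risks and desk-desk
correlations.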
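The desk-level relations can be illustrated with a small linear-risk calculation. This is a minimal sketch, assuming three underlying variables and three desks with made-up exposures, vols and correlations; the desk names and the helper `cross` are hypothetical. It computes stand-alone risks (30.7), desk CVARs (30.5), desk-desk correlations (30.10), and checks the aggregation formula (30.11) against the direct quadratic form (30.3).

```python
import math

# Hypothetical vols and correlations for three underlying variables.
sigma = [0.8, 1.2, 0.5]
rho = [[1.0, 0.4, 0.1], [0.4, 1.0, -0.2], [0.1, -0.2, 1.0]]

# Hypothetical desk exposures to each variable (zero where a desk has no exposure).
desk_exposures = {
    "rates": [100.0, 0.0, 20.0],
    "credit": [-30.0, 50.0, 0.0],
    "equity": [0.0, 10.0, 80.0],
}

def cross(e_a, e_b):
    """Bilinear form sum over alpha, beta of E_a * sigma * rho * sigma * E_b."""
    n = len(sigma)
    return sum(e_a[i] * sigma[i] * rho[i][j] * sigma[j] * e_b[j]
               for i in range(n) for j in range(n))

desks = list(desk_exposures)
total_exposure = [sum(desk_exposures[d][i] for d in desks) for i in range(len(sigma))]

var_quad = math.sqrt(cross(total_exposure, total_exposure))                       # Eq. (30.3)
standalone = {d: math.sqrt(cross(desk_exposures[d], desk_exposures[d])) for d in desks}  # Eq. (30.7)
desk_cvar = {d: cross(desk_exposures[d], total_exposure) / var_quad for d in desks}      # Eq. (30.5)
desk_corr = {(a, b): cross(desk_exposures[a], desk_exposures[b]) / (standalone[a] * standalone[b])
             for a in desks for b in desks}                                       # Eq. (30.10)

# Eq. (30.11): rebuild the total from stand-alone risks and desk-desk correlations.
var_from_desks = math.sqrt(sum(standalone[a] * desk_corr[(a, b)] * standalone[b]
                               for a in desks for b in desks))
```

The desk CVARs add up to the total, whereas the stand-alone risks generally do not; the difference reflects diversification between desks.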


Aged Inventory and Illiquidity 


The aged inventory is a set of transactions, mostly with definite and pronounced 
illiquid attributes. We focus here on those positions that have been on the books 
for a long time and are tagged as being part of the aged inventory, and which 
cannot be transacted without potential substantial losses⁷. The aged inventory is
naturally of concern to management and is monitored regularly. 


⁶ Alternate Method: There is also a method involving least squares fitting that produces
desk-desk correlations to fit a given set of CVARs produced by MC Simulation. This
method also works in the nonlinear case involving a convexity grid.


⁷ What is this Toxic Waste, and Why is It There? There are many types of illiquid
securities. They can be bonds concentrated in lower credits or subordinated bonds in
corporate or emerging market sectors, illiquid mortgage products, tranches of structured
deals, etc. Illiquid securities can exist for many reasons, including (1): Securities may be
left over from underwriting deals that didn't sell out to investors, (2): There may be odd

Not all transactions that have been on the books for a long time are illiquid,
since long-term strategies can exist for liquid markets. Liquid securities held for 
strategic purposes can be sold quickly without loss (except for some perceived 
opportunity cost of abandoning the strategy). Interesting as these strategies may 
be, we are only concerned here with illiquidity. Of course, it is possible that the 
strategy can fail and the result can be that the supposed liquid securities involved 
in the strategy suddenly become somewhat illiquid, and can be sold only with 
some losses. 

The calculation of VAR for aged inventory is uncertain. It is difficult to deal 
with the risk of the aged inventory by its very nature. First, accurate current 
prices are hard to obtain, since by definition illiquid securities are not selling well 
in the market. Sometimes pricing sources will disagree significantly on pricing⁸.
In exceptional cases, there may be no price at all. More importantly, there may 
not be direct risk exposure information, e.g. DV01, requiring extra assumptions. 


Variables for Aged Inventory 


• The horizon time $t_{\mathrm{Horizon}}$. This is the calendar date before which the desk
plans to sell the security.
• The liquidation time interval $\Delta t_{\mathrm{Liq}}$. This is the time that it would take to sell
a security at a given price after the desk decides to sell it.
• The liquidation penalty $\Lambda_{\mathrm{Liq}}$. This is the penalty incurred selling into a
stressed environment.


These variables are clearly related. Generally (though not always) there will
be a buyer at a low enough price. Taking longer to sell (larger $\Delta t_{\mathrm{Liq}}$) may give
more opportunities for sale with a smaller penalty $\Lambda_{\mathrm{Liq}}$. A later $t_{\mathrm{Horizon}}$ can lead
to a reversal of market factors leading to better liquidity. Of course,
pessimistically, the reverse can happen.


Aged Inventory Reports 


An aged inventory report might list a description of the illiquid deals, some 
remarks related to the volume of similar deals trading in the market, pricing, etc. 


lot amounts from deals with customers, (3): There can be a drop in market demand which 
lowers liquidity, (4): There may be a decrease in credit quality that lowers liquidity, etc. 


⁸ Disagreements of Pricing Sources: Occasionally prices can differ widely, e.g. 30% for
illiquid securities, including some mortgage derivatives, etc. Such pricing uncertainty 
reflects uncertainty in models, softness in the market, etc. 




Quantitative Analysis of Aged Inventory 

The difficulty of performing detailed quantitative analysis should be clear from 
the previous remarks. However, some analysis is possible in some circumstances. 
This can include the following: 


• Statistical analysis can be performed for downward price moves at a very
high confidence level on the tails of historical data, if relevant or analogous
data exist. It is not enough just to look at ordinary standard deviations. The
“fat tail” volatility, which we discuss in Ch. 21, should be used.

• In addition there can be slow but significant downward-moving price
movements that are missed in any standard deviation calculation. If these are
judged important, or for conservative estimates, they should be added.

• Judgmental scenarios based on proxy examples or analogous situations or
events can be used. Sometimes such a scenario is all that is available. This
method requires intimate knowledge and expertise of the local situation by
the risk manager.


31. Credit Risk: Issuer, Counterparty (Tech. Index 5/10)


In this chapter we discuss credit risk. First we discuss issuer credit risk¹,² for
bonds or other securities³. We will also consider the relation of issuer credit risk
and market risk, and discuss why and how sometimes these are calculated 
separately. We present a straightforward method of defining a unified credit + 
market risk measure. 

We discuss counterparty risk, including CVA (Credit Valuation Adjustment) 
and PFE (Potential Future Exposure) and point out the difference between risk 
neutral (RN) and real world (RW) dynamics. The counterparty X of a deal 
between X and Y is just the other party to Y. Counterparty risk is the risk that the 
counterparty defaults on some condition of the deal. This is not the same as issuer 
risk. The counterparty to a deal involving a bond can default, but the issuer of the 
bond can be solvent, or vice-versa. 

We connect PFE with the real-world Macro Micro model, discussed later in 
the book. We discuss a correlated 2 dimensional extended Merton default model. 
We also briefly discuss other aspects, including WWR (Wrong-Way Risk), FVA 
(Funding Valuation Adjustment), factor models with idiosyncratic improvements 
for equity counterparty risk, firm-wide credit risk calculations, and regulations. 


Issuer Credit Risk 


An issuer is typically a corporation or a government that issues debt. Issuer credit
risk for a bond⁴ is the risk that the issuer of the bond suffers a credit downgrade
or default. We treat issuer risk here as “real world”, with risk measured as it
occurs for actual defaults and actual downgrades of cohorts of similar entities.
Issuer risk is not concerned with “risk neutral” market-implied probabilities of
default obtained through values of credit default swaps (cf. Ch. 9). Risk-neutral
default probabilities are quite different from real-world default probabilities
obtained by actually counting defaults.


¹ Acknowledgements: I thank Jack Fuller and Jim Marker for informative discussions on
issuer credit risk. I thank Rick Stuckey for helpful conversations. I thank Citigroup for
providing Moody’s transition/default matrices. I also thank Harvey Stein for insightful
discussions on credit and many other topics.


² Incremental Risk Charge IRC: Issuer risk is now classified in the Basel regulations as
part of “Incremental Risk Charge”. Regulations are changing as of this writing.


³ History: The formalism of stressed transition/default matrices and the unified market +
credit risk simulation described here was done by me in 2000-01.


⁴ Bonds and Loans: We use the word “bond” in the text, but similar considerations with
somewhat different parameters apply to loans.

Although the rating agencies are not used now in the same way they used to 
be due to events in the 2008 recession, and although models for credit risk 
determination are in a state of flux, it is instructive to frame issuer risk using 
agency information, and this is the path we follow here. 

The determination of issuer credit risk for a particular bond depends on:

• The starting credit $\alpha$ of the bond, here taken from a rating agency⁵
• The probability for credit change $p_{\alpha \to \beta}$ from credit $\alpha$ to credit $\beta$
• The probability for default $p_{\alpha \to \mathrm{Default}}$
• The recovery rate $R$ for the bond in case of default

Issuer risk for a portfolio of bonds relies in addition on

• The confidence level CL assumed for the calculation
• The portfolio of bonds: its composition, size, etc.

Given these parameters, the issuer credit risk for a portfolio is the loss⁶
determined, e.g., by a Monte-Carlo simulation at the given confidence level CL.
If sufficient reserves or other considerations exist to compensate for expected
losses⁷, the unexpected issuer credit risk is measured from the expected level. In
this chapter we will mostly be concerned with unexpected issuer credit risk.

For the expected issuer credit risk, the same calculation is done and the
average loss is picked out. This expectation can either be the numerical average
or the median, although the interpretation of the unexpected loss is easier if the
expectation is the numerical average.


⁵ Credit Ratings: This presentation uses the ratings of rating agencies to define credit
and (somewhat unrealistically) assumes that rating changes are timely, nominally one
year. Alternative (e.g. internal) ratings are increasingly being used instead. Additional
complications arise. For example, the credit rating of a bond in a foreign distressed
environment will be negatively affected, independent of ratings of bonds by the same
issuer in non-distressed environments.


⁶ Currency Units: Although the results for all bonds ultimately have to be expressed in
the reporting currency, e.g. USD, we do not indicate currency units here.


⁷ Reserves and Expected Losses Consistency: This consistency needs to be checked.
Different groups may determine reserves and carry out the expected loss calculations.
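The expected and unexpected loss definitions can be illustrated with a deliberately simplified Monte-Carlo sketch. It is not the full method described in the text: it assumes independent defaults, ignores mark-to-market losses from downgrades, and uses a hypothetical portfolio, recovery rate and confidence level. The default probabilities 1.9% (Ba) and 6.8% (B) are taken from the 1970-99 average matrix quoted later in this chapter; the function names are illustrative.

```python
import random

def simulate_losses(notionals, default_probs, recovery, n_paths, seed=0):
    """Default-only loss per path: each bond defaults independently (assumption)."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_paths):
        loss = sum(n * (1.0 - recovery)
                   for n, p in zip(notionals, default_probs)
                   if rng.random() < p)
        losses.append(loss)
    return losses

def loss_at_confidence(losses, cl):
    """Empirical loss quantile at confidence level cl."""
    ordered = sorted(losses)
    index = min(int(cl * len(ordered)), len(ordered) - 1)
    return ordered[index]

# Hypothetical portfolio: 20 Ba bonds and 20 B bonds, unit notional each.
notionals = [1.0] * 40
default_probs = [0.019] * 20 + [0.068] * 20
recovery = 0.4            # hypothetical recovery rate
confidence_level = 0.99   # hypothetical CL

losses = simulate_losses(notionals, default_probs, recovery, n_paths=20000)
expected_loss = sum(losses) / len(losses)            # numerical average
loss_cl = loss_at_confidence(losses, confidence_level)
unexpected_loss = loss_cl - expected_loss            # measured from the expected level
```

Replacing the independence assumption with correlated defaults, and adding downgrade losses from the full transition matrix, would bring the sketch closer to the calculation described above.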


Transition/Default Probability Matrices and Issuer Risk 


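The transition and default probability credit matrix contains the corresponding
probabilities, starting at some time $t$ over a given time period $T$ (e.g. 1 year).
Using the Moody's notation Aaa etc., the matrix for general credits $\alpha, \beta$ is

    $(p_{\alpha \to \beta}) = \begin{pmatrix} p_{Aaa \to Aaa} & \cdots & p_{Aaa \to \beta} & \cdots & p_{Aaa \to C} & p_{Aaa \to \mathrm{Default}} \\ p_{\alpha \to Aaa} & \cdots & p_{\alpha \to \beta} & \cdots & p_{\alpha \to C} & p_{\alpha \to \mathrm{Default}} \\ p_{C \to Aaa} & \cdots & p_{C \to \beta} & \cdots & p_{C \to C} & p_{C \to \mathrm{Default}} \end{pmatrix}$    (31.1)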

Examples (Bad Year, Average, Worst Cases) 


Historical probabilities are tabulated. For example, for 1990 (a particularly bad
year), the 1-yr transition/default matrix for U.S. corporates was⁸:


1990 Historical 1-year Credit Transition and Default Matrix, Moody's


|    | Aaa  | Aa   | A    | Baa  | Ba    | B     | Caa-C | Default |
| Ba | 0.0% | 0.0% | 0.0% | 2.4% | 78.4% | 14.8% | 0.7%  |         |
| B  | 0.0% | 0.6% | 0.3% | 0.6% | 3.0%  | 74.7% | 3.3%  |         |


Moody's has tabulated these matrices since 1970. For example the average
transition/default matrix over 1970-99 shows less risk than the bad year 1990:


Transition & Default Matrix Average 1970 - 1999


|       | Aaa  | Aa   | A    | Baa  | Ba    | B     | Caa-C | Default |
| Ba    | 0.0% | 0.0% | 0.4% | 5.4% | 86.2% | 6.9%  | 0.4%  | 1.9%    |
| B     | 0.0% | 0.0% | 0.1% | 0.4% | 6.9%  | 83.7% | 2.0%  | 6.8%    |
| Caa-C |      |      |      |      |       |       | 67.3% | 25.8%   |


⁸ Matrices with Sub Grades or lumped Whole Grades: There are also matrices for
subgrades with more credit-level refinement (and naturally fewer cases per cell). Matrices
can also be defined with whole grades lumped together. The matrices in the text have all
C-grade bonds lumped together.


Here is the number of cases in each cell of the 1990 matrix: 


Number of Cases for Each Credit Transition or Default, 1990 


For a given portfolio in an historical simulation, the worst-case loss will 
result from one of the historical matrices. We could call this matrix the 
“Historical Worst Case” matrix. 


The year for the worst case for a given $p_{\alpha \to \beta}$ matrix element differs
depending on the matrix element. Using this fact, a “Theoretical Historical Worst
Case” matrix can be constructed that is even worse than the “Historical Worst
Case”. We first define the quantity $p^{\mathrm{MaxMin}}_{\alpha \to \beta}$. This is the maximum probability in
the case of downgrades or default, or the minimum probability in the case of
unchanged credit or upgrades, measured over time, viz

    $p^{\mathrm{MaxMin}}_{\alpha \to \beta} = \begin{cases} \max_t \{ p_{\alpha \to \beta}(t) \} & \text{if Downgrade, Default} \\ \min_t \{ p_{\alpha \to \beta}(t) \} & \text{if Unchanged, Upgrade} \end{cases}$    (31.2)

Then the theoretical worst-case matrix element $p^{\mathrm{TheorWorstCase}}_{\alpha \to \beta}$ can be defined by
normalizing the total probability to one, namely

    $p^{\mathrm{TheorWorstCase}}_{\alpha \to \beta} = p^{\mathrm{MaxMin}}_{\alpha \to \beta} \Big/ \sum_{\gamma = Aaa}^{\mathrm{Default}} p^{\mathrm{MaxMin}}_{\alpha \to \gamma}$    (31.3)

The result of performing these operations is given in the table below:


Theor. Worst Case (Max downgrade, Min unchanged or upgrade; Renormalized) 


   | Aaa | Aa | A | Baa | Ba | B | Caa-C | Default |
| 
24%[ 6.2%| 16%] 25" 
Bo 


Ba | 0.0% | 0.0% | 0.0% | 2.3% | 58.4% | 28.1% | 5.0% | 6.2% |
0.0%] 68.4% 7.0%] 24.6% 
0.0%[ 0.0% 0.094 100.0% 
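
The max/min construction of (31.2) followed by the renormalization of (31.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `theoretical_worst_case_row`, the column ordering from Aaa down to Default, and the two yearly Ba rows are hypothetical and are not taken from the Moody's data.

```python
RATINGS = ["Aaa", "Aa", "A", "Baa", "Ba", "B", "Caa-C", "Default"]

def theoretical_worst_case_row(yearly_rows, start_index):
    """Worst-case row for one initial rating from a list of yearly rows.

    yearly_rows: one probability row per year, columns ordered as RATINGS.
    start_index: column index of the initial rating (the diagonal element).
    """
    n_cols = len(yearly_rows[0])
    max_min = []
    for j in range(n_cols):
        column = [row[j] for row in yearly_rows]
        if j > start_index:
            max_min.append(max(column))  # downgrade or default: take the maximum
        else:
            max_min.append(min(column))  # unchanged or upgrade: take the minimum
    total = sum(max_min)
    return [p / total for p in max_min]  # renormalize to unit probability

# Hypothetical Ba rows for two years (illustrative numbers only).
ba_rows = [
    [0.0, 0.0, 0.004, 0.054, 0.811, 0.120, 0.004, 0.007],
    [0.0, 0.0, 0.000, 0.030, 0.820, 0.100, 0.010, 0.040],
]
worst_ba = theoretical_worst_case_row(ba_rows, RATINGS.index("Ba"))
```

Because the maximum and minimum are taken element by element, the worst-case row mixes different years, which is why it can be worse than any single historical year.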


Models for Stressed Transition/Default Probability Matrices 


In this section we will be interested in constructing a stressed matrix for a bad 
credit environment away from the average, using a model approach. Since we 
naturally are going to be looking at losses, we want to increase the probability of 
downgrades or default, and decrease the probability of upgrades or unchanged 


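credit. To this end, we define the sign flag $\eta_{\alpha\to\beta}$ as


$$\eta_{\alpha\to\beta} = \begin{cases} +1 & \text{if Downgrade, Default} \\ -1 & \text{if Unchanged, Upgrade} \end{cases} \qquad (31.4)$$


We denote as $\sigma_{\alpha\to\beta}$ the historical volatility, defined for each $p_{\alpha\to\beta}$ as
given by the standard deviation of $p_{\alpha\to\beta}(t)$ over time from the available data.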


Examples of Model Transition/Default Matrices 


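As a simple approach to a stressed matrix, we can specify a certain number $k_{\alpha\to\beta}$
of standard deviations $\sigma_{\alpha\to\beta}$ for each transition (and similarly for defaults). We
then define $\delta p_{\alpha\to\beta}$ by the following model assumption:


$$\langle p_{\alpha\to\beta} \rangle + \delta p_{\alpha\to\beta} = \max\left( \langle p_{\alpha\to\beta} \rangle + \eta_{\alpha\to\beta}\, k_{\alpha\to\beta}\, \sigma_{\alpha\to\beta},\ 0 \right) \qquad (31.5)$$


We use $\delta p_{\alpha\to\beta}$ to perturb the time-averaged transition/default matrix $\langle p_{\alpha\to\beta} \rangle$.
After renormalization to unit probability, we get the model stressed matrix
element $p^{Stressed}_{\alpha\to\beta}$ as


$$p^{Stressed}_{\alpha\to\beta} = \left[ \langle p_{\alpha\to\beta} \rangle + \delta p_{\alpha\to\beta} \right] \Big/ \sum_{\gamma=Aaa}^{Default} \left[ \langle p_{\alpha\to\gamma} \rangle + \delta p_{\alpha\to\gamma} \right] \qquad (31.6)$$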


For example, suppose $k_{\alpha\to\beta} = k_{\alpha\to Default} = 1.0$. We get in this case the model
stressed matrix shown below:


Model Stressed Matrix, # stdev are $k_{\alpha\to\beta} = 1.0$ and $k_{\alpha\to Default} = 1.0$


   | Aaa | Aa | A | Baa | Ba | B | Caa-C | Default |
61% 60% 005] 005 
0.3% 
1.2% 


0 0 
Ba | 0.0% | 0.0% | 0.0% | 2.5% | 81.7% | 11.6% | 1.3% | 2.8% |
в | oo| 00% 04%] 0.0%] 33% 
0.0% 


Referring back, we see that this stressed probability matrix bears a qualitative 
resemblance to the historical bad 1990 matrix. 


9.3% 


We can generate other model matrices by varying the parameters $\{k_{\alpha\to\beta}\}$.
For example, if we set $k_{\alpha\to\beta} = k_{\alpha\to Default} = 2.3$, so each matrix element is taken
at a 99% CL with the cutoff in the model ansatz, we get (after renormalization)
the following stressed matrix:


Model Stressed Matrix, # stdev are $k_{\alpha\to\beta} = 2.3$ and $k_{\alpha\to Default} = 2.3$


   | Aaa | Aa | A | Baa | Ba | B | Caa-C | Default |
З 3. . 
А я .19 


% 
% 


M 
Ba | 0.0% | 0.0% | 0.0% | 0.0% | 70.3% | 18.2% | 2.4% | 9.1% |
B  | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 63.3% | 6.1% | 30.6% |


This stressed matrix is qualitatively similar to the theoretical worst-case historical 
matrix. 
Actually we have not been trying to fit anything. Better fits could be obtained
by refining the choice of the parameters $\{k_{\alpha\to\beta}\}$. We know roughly how to
choose the $\{k_{\alpha\to\beta}\}$ to get the historically bad and worst-case matrices. Hence we
can move these parameters around in a sensible way to generate many stressed model
matrices. We can also generate matrices further out on the tail even than the
theoretical historical worst-case matrix. Note that there can be smoothing issues.
In this matrix, the stressed default probability of Aa is larger than that of A,
which is not reasonable.
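
The model stressed matrix (sign flag, a chosen number of standard deviations, cutoff, renormalization) can be sketched as follows. This is a minimal sketch under assumptions: the function name `stressed_row` is hypothetical, the cutoff is taken to be a floor at zero on the perturbed probability, and the average row and volatilities are made-up numbers rather than the historical values.

```python
def stressed_row(avg_row, sigma_row, start_index, k=1.0):
    """Stress one row of the time-averaged matrix by k standard deviations."""
    perturbed = []
    for j, (p_avg, sigma) in enumerate(zip(avg_row, sigma_row)):
        # Sign flag: +1 for downgrade/default, -1 for unchanged/upgrade.
        eta = 1.0 if j > start_index else -1.0
        # Assumed cutoff: a perturbed probability is not allowed below zero.
        perturbed.append(max(p_avg + eta * k * sigma, 0.0))
    total = sum(perturbed)
    return [p / total for p in perturbed]  # renormalize to unit probability

# Hypothetical Ba row (columns Aaa ... Default) and hypothetical volatilities.
avg_ba = [0.0, 0.0, 0.004, 0.054, 0.862, 0.060, 0.004, 0.016]
sigma_ba = [0.0, 0.0, 0.004, 0.020, 0.040, 0.030, 0.005, 0.010]
mild = stressed_row(avg_ba, sigma_ba, start_index=4, k=1.0)
severe = stressed_row(avg_ba, sigma_ba, start_index=4, k=2.3)
```

A single `k` is used for all elements here for brevity; passing a separate value per element would allow the kind of parameter tuning described above.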


Stochastic Transition/Default Matrices 


A model to generate stochastic matrices can be envisioned as an extension of the
scenario stressed matrix approach just described. We would need a multivariate
formalism, including correlations $\rho_{\alpha\beta;\alpha'\beta'} = \rho(\delta p_{\alpha\to\beta}, \delta p_{\alpha'\to\beta'})$ between
changes $\delta p_{\alpha\to\beta}$, $\delta p_{\alpha'\to\beta'}$ from the average of different matrix elements. We
also need to replace $\{k_{\alpha\to\beta}\}$ with random numbers. Finally we need constraints
or a parameterization to ensure a rough monotonic decrease of transitions away
from the diagonal. This just means that at least most of the time, two-level
transitions should be less probable than one-level transitions, etc.

In this way we can establish a method to generate stochastic transition/default 
credit matrices. 


Distressed Bonds 


A bond becomes distressed if it has a high probability of default. This is 
characterized by the bond having low credit quality and low price. The reason for 
the price constraint is that a low credit firm can be solvent from a cash flow 
standpoint, so not distressed, with a high coupon bond giving a high price. A 
distressed bond index can be created with these constraints". 


Calculation of Issuer Risk - Generic Case 


We can determine the issuer credit risk due to downgrade and default by 
straightforward simulation. We merely run through the portfolio⁹ one bond at a
time¹⁰, get the distribution of losses for the portfolio, pick out the portfolio loss at


9 Portfolio Data: Hopefully you will receive pristine data, all the credit ratings will be
present, current and consistent, no bonds will be missing, and the files will not have any 
formatting errors. In the contrary case, please refer to the Black Hole Data Theorem. 


10 Netting of Short and Long Bond Positions for Given Issuer: All bond positions 
(long plus short) of a given issuer at a given credit rating in the same portfolio are netted 
before performing the calculation. However, a net short position of one issuer behaves 
very differently than a net long position of another issuer, as we shall see. 
some given confidence level, subtract the average loss, and write up a nice report
for the management with some pretty color plots¹¹.

We will explicitly consider a pedagogically simple case of a portfolio of plain-vanilla
bonds, a single transition/default matrix, a single recovery rate in case of
default, and well-defined distinct spread levels. We generalize to more
complicated cases below.


Bond Price Changes and Spreads; the Spread DV01
We quickly set up some generic notation. A bond's price $B(s)$ depends on its
spread $s = y_B - y_{Base}$, where $y_B$ is the bond's yield and $y_{Base}$ is the base yield
(e.g. treasury) at the same maturity. The bond price for credit rating $\alpha$ is
$B_\alpha = B(s_\alpha)$ where $s_\alpha$ is the spread for bonds with credit $\alpha$.

A bond's spread $s_\alpha$, its credit rating $\alpha$, and the values of the probabilities
$\{p_{\alpha\to\beta}\}$ are naturally closely related. The change in the bond value $\delta B_\alpha$ is
determined by changes in spreads, as we next describe.

The "spread DV01" is $\Delta_\alpha \equiv -\partial B_\alpha / \partial s_\alpha$, with spreads measured in bp
(or more exactly, bp/yr). The change in value of the bond for credit change
$\alpha\to\beta$, with corresponding spread change $\delta s_{\alpha\beta} = s_\beta - s_\alpha$, is¹²
$\delta B_{\alpha\to\beta} \approx -\Delta_\alpha \cdot \delta s_{\alpha\beta}$. If there is no credit change ($\alpha\to\alpha$), then
$\delta B_\alpha^{NoCreditChange} \approx 0$, i.e. to first approximation there is no price change. Actually
there is a (smaller) price change from the time dependence of the fixed-credit
spread. We will amplify this statement when we discuss market risk.


Defaulting Bond Price Changes
For default ($\alpha\to$ Default), the bond loses its value¹³ $B_\alpha$ and it gains the
recovery rate $R$ fraction of the notional $N$, so $\delta B_{\alpha\to Default} = -B_\alpha + R \cdot N$.


11 Color Plots in Reports: Don't knock them. You will be happier if the management
likes your presentations. 


12 Spread Gamma and Full Reval: To capture nonlinear dependence on spread, the
second derivative of the price with respect to spread change (spread gamma) enters. A
more accurate representation is to do "full reval", i.e. call the pricing function with the
changed spreads.


13 Bond Value Change Under Default: Sometimes the bond is taken to lose value equal
to its notional and get back the recovery fraction of its notional. This is not appropriate 
for a bond that is marked to market, as in a trading portfolio. For a bond in a holding 
portfolio that is carried at notional value, it is appropriate. More details are given below. 
Monte-Carlo Simulations of Credit Issuer Risk


The unit interval is partitioned into segments of lengths $\{p_{\alpha\to\beta}\}$ and $p_{\alpha\to Default}$
(the lengths adding to one). Perform $Run_1$, the first portfolio "run". Given a
bond $B_\alpha$ with credit $\alpha$, get a random number $\xi(B_\alpha; Run_1)$ from the uniform
distribution $U(0,1)$. This picks out a final state (credit $\beta$ or Default), with
corresponding matrix element $p_{\alpha\to\beta}$ or $p_{\alpha\to Default}$. The change in value of the
bond, $\delta B_\alpha(Run_1)$, is given by the appropriate case listed above.

We continue successively with each bond and add up the changes to get the
total portfolio change for the first run, $\delta P(Run_1) = \sum_{AllBonds} \delta B_\alpha(Run_1)$, which
we save. We repeat the whole procedure for many runs, obtaining the set of
portfolio changes $\{\delta P(Run_k)\}$. We then pick out the change $\delta P^{CL}$ at the
specified confidence level CL for the calculation.

It needs to be emphasized that this calculation is not the expected credit risk¹⁴
$\langle \delta P \rangle$. In fact, the reported risk is the difference between $\delta P^{CL}$ and $\langle \delta P \rangle$:


$$\delta P_{Risk} = \delta P^{CL} - \langle \delta P \rangle \qquad (31.7)$$
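
The run-by-run procedure above can be sketched in plain Python. This is a sketch under assumptions, not a production simulator: the helper names (`value_changes`, `pick_state`, `simulate_issuer_risk`), the single hypothetical Ba bond, and all numerical inputs are illustrative, and the bonds are treated as independent.

```python
import random

def value_changes(price, notional, spread_dv01, spreads_bp, start, recovery):
    """Value change for each final state; the last entry is default."""
    changes = [-spread_dv01 * (s - spreads_bp[start]) for s in spreads_bp]
    changes.append(-price + recovery * notional)  # lose market value, gain recovery
    return changes

def pick_state(probs, u):
    """Map a uniform draw u in [0, 1) to a state via the partitioned unit interval."""
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return index
    return len(probs) - 1  # guard against rounding in the last segment

def simulate_issuer_risk(bonds, n_runs=20000, cl=0.99, seed=0):
    """bonds: list of (probs, changes). Returns (change at CL, mean change, risk)."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        totals.append(sum(changes[pick_state(probs, rng.random())]
                          for probs, changes in bonds))
    totals.sort()  # most negative change (largest loss) first
    rank = max(int(round((1.0 - cl) * n_runs)), 1)
    change_cl = totals[rank - 1]
    mean_change = sum(totals) / n_runs
    return change_cl, mean_change, change_cl - mean_change

# Hypothetical Ba bond: final states Baa, Ba, B, then Default (illustrative numbers).
probs = [0.05, 0.85, 0.07, 0.03]
changes = value_changes(price=95.0, notional=100.0, spread_dv01=0.05,
                        spreads_bp=[150.0, 270.0, 850.0], start=1, recovery=0.4)
change_cl, mean_change, risk = simulate_issuer_risk([(probs, changes)] * 10)
```

The third returned value is the reported risk, the change at the chosen CL minus the average change, so it is more negative than the expected loss alone would suggest.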


Transition/Default Probability Uncertainties 


If we have many possible transition/default matrices $\left( p^{(X)}_{\alpha\to\beta} \right)$ labeled by an
index $X$, we simply precede each run by throwing the dice to choose a particular
matrix. For example, the historical simulation is obtained by choosing a matrix
from some year $t$ at random, $\left( p_{\alpha\to\beta}(t) \right)$, to start a run, repeating such a draw to
get a matrix for each run through the portfolio. If the transition/default matrices
are generated through a model, the same procedure is adopted.


" Expected Credit Loss vs. High-CL Credit Loss: Some people are used to calculating 
expected credit losses $\langle \delta P \rangle$ but not high-CL unexpected credit losses $\delta P^{CL}$. Intuition is
quite different in these two cases. Statements like “That bond will never default. I don’t 
believe your calculation" may be appropriate for expected loss but not for high-CL loss. 
Recovery Rate Uncertainty


The recovery rate in case of default is not constant in time. If economic 
conditions are bad, less will be recovered because less is available to be 
recovered. The recovery rate also depends on the type of security (e.g. bonds or 
loans), on the seniority of the bond (senior vs. junior subordinated debt), and on 
complex legal issues. The uncertainty in the recovery rate can be modeled by a
distribution $P(R)$, with a draw from the distribution every time a default is
signaled in the simulation.


Spread Uncertainty for a Given Credit 
So far we have assumed that a spread $s_\alpha$ for a given credit $\alpha$ is unique. This is
not true. There is first some uncertainty $\delta s_\alpha$ in the spread at a fixed credit $\alpha$ at a
given time. There is also a time dependence of a spread over time interval $Dt$.
These features can be modeled. In the historical simulation, we choose a spread
$s_\alpha(t)$ at time $t$ including an extra random uncertainty $\delta s_\alpha$. Then, for a credit
transition $\alpha\to\beta$ over time interval $Dt$, we take the spread change
$d_t s_{\alpha\beta}(t) = s_\beta(t + Dt) - s_\alpha(t)$, also including some extra random uncertainty
for the final spread $s_\beta(t + Dt)$. We consider this further when we discuss
market risk below.


Short Positions: Why They Don’t Alleviate the Risk at High CL 


We have already mentioned¹⁰ that short and long positions of a given issuer for a
given credit in the same portfolio are netted out to define what we have been 
calling a “bond” in this chapter, before performing the issuer-risk calculation. A 
net short position in a bond contributes to the average or expected risk. However, 
generally a net short position in a bond hardly contributes at all to the 
unexpected loss at a high CL. In a high-CL loss state for unexpected issuer risk, 
short positions do not default. This is because the gains due to the default of a 
short position show up in states that are less risky than the state specified at the 
high-risk CL. For this reason, it is misleading to try to get a “ballpark” 
unexpected risk number for a portfolio by subtracting the short notional from the 
long notional and using some average credit value. 


Concentration (Large Position) Risk 
The calculations for a high CL can be lumpy in an important sense. If we have 


one bond $B^{Huge}$ in a portfolio with an exceptionally large notional, the result at a
high CL can be that this bond defaults. The concentration risk that this implies
can be very large, much larger than the loss at a high CL without that bond.
Moreover, as we move the CL down, at some point the $B^{Huge}$ bond will not
default, and the issuer risk can suddenly jump down to a much lower level.
Conversely, if we move the CL up, at some point the $B^{Huge}$ bond will default in
the calculation, and the traders may start screaming.


Dependence of the Results on Portfolio Definitions 


Naturally if the portfolio with the $B^{Huge}$ bond is put into another portfolio with
other "huge" bonds, the relative importance of the $B^{Huge}$ bond is less, and at the
given high CL this bond may no longer default.

Therefore the portfolio structure, which may depend on arbitrary definitions, 
can be very important for the calculation of credit risk. We will exhibit this sort 
of effect in the example of issuer risk below. 


Credit Derivatives and Issuer Risk 


Credit derivatives are instruments that pay off under some sort of credit event. 
The event can be a default, downgrade, or spread change... We discussed credit 
default swaps in Ch. 8. Here we mention that risk offsets occur due to the netting 
of credit derivatives with bonds. The degree of risk netting has to be determined 
on a case-by-case basis. Offsets should be included appropriately in credit issuer- 
risk calculations. 


Credit Correlations 


If we try to model transitions and default from an ab-initio perspective, the
transitions $\alpha\to\beta$ and default $\alpha\to$ Default from an initial state $\alpha$ are in
general dependent on the transitions $\alpha'\to\beta'$ and default $\alpha'\to$ Default from
another initial state $\alpha'$. That is, there are correlations between different
transitions and between different defaults¹⁵.

We can calculate correlated defaults of firms ABC and DEF with models. A 
popular model is the Merton model or a variant of it. The firm ABC is assumed 
to default on a particular ABC asset-value stochastic path if the ABC barrier 
(defined by too little ABC asset value relative to ABC debt) is hit. We can run a 
Merton model with correlated stochastic movements in a two-dimensional 
framework of ABC, DEF asset values. The probability for ABC to default is 
correlated with the dynamics of the other firm's DEF value. See later for an 
example of such a calculation. 

If we base the issuer-risk credit calculations on historical credit 
transition/default matrices, credit correlations are not a problem. The correlations 
that existed historically are already built into the historical transition and default 


15 These correlations are not the same as the correlations between default times that are 
critical for tranched credit derivatives. We will not discuss these products in the book. 
probability numbers. The stressed matrices retain a measure of these correlations.
Hence, the bonds can be treated as independent in this approach. 


Simple Example of Issuer Credit Risk Calculation 


Here is an illustrative simple example. We take two business units and eight 
“bonds” representing desk portfolios that are investment grade or junk. The rows 
in the table below give the “bond” names, their loss in case of default (in $000), 
their average credit, their probabilities of default, and their business unit names. 
This simple calculation ignores non-default transitions. 


Bonds, Loss if default ($000), Credit, Prob. Default, Business Unit 


There are 256 possible states, where each bond can default or not. Each state 
has a composite probability obtained by multiplication (e.g. 0.0301 if B1
defaults, times 0.9986 if B2 doesn’t default, etc). We can generate a pseudo 
Monte-Carlo simulator by listing states and probabilities. We assign to each state 
a fraction of the total number of MC “paths” equal to that state’s probability. We 
list the losses in decreasing order to get the loss for each CL by counting the 
appropriate number of “paths” for that CL (or as close to it as possible). 
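
The state-listing procedure can be sketched as an exact enumeration. The sketch below assumes independent defaults and accumulates probability from the smallest loss upward, which picks out the same loss as counting "paths" down from the largest loss. The function name `loss_at_cl` and the loss amounts are hypothetical; only the two default probabilities (3.01% and 0.14%) are taken from the text.

```python
from itertools import product

def loss_at_cl(bonds, cl):
    """bonds: list of (loss_if_default, prob_default) pairs, defaults independent.

    Returns the smallest loss L with P(loss <= L) >= cl, by listing every state.
    """
    states = []
    for defaults in product([False, True], repeat=len(bonds)):
        prob, loss = 1.0, 0.0
        for (loss_if_default, pd), defaulted in zip(bonds, defaults):
            prob *= pd if defaulted else 1.0 - pd
            loss += loss_if_default if defaulted else 0.0
        states.append((loss, prob))
    states.sort()  # increasing loss
    cumulative = 0.0
    for loss, prob in states:
        cumulative += prob
        if cumulative >= cl - 1e-12:  # small tolerance for floating-point sums
            return loss
    return states[-1][0]

# Default probabilities from the text; loss amounts ($000) are illustrative.
b1 = (1800.0, 0.0301)
b2 = (5000.0, 0.0014)
loss_b1 = loss_at_cl([b1], 0.996)
loss_b1_b2 = loss_at_cl([b1, b2], 0.996)
```

With eight bonds the same function lists all 256 states; the cost doubles with each added bond, so this exact approach is only practical for small illustrative portfolios.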

Now let's add one bond at a time and calculate the losses at the 99.6% CL, as
an example. We get the results below: 


Losses for different portfolios successively adding bonds at CL= 99.6% 
B1 B2 B3 B4 B5 B6 B7 B8 


$1,800 - 
$1,800 - - 


The first bond B1 alone by itself defaults at the 99.6% CL. This is because
the probability of default is 3.01% for this junk bond, well above the 0.4%
threshold implied by this CL. Now a curious thing happens. As we successively add one bond at a
time, the issuer risk at this CL remains the same up to B4. Physically, it is clear
that a portfolio with (B1...B4) has more risk than a portfolio with just B1.
Nonetheless, at this CL, the risk does not change. When we add B5, only B5
defaults. This says that the issuer risk has shifted from business unit a to
business unit b. Adding B6 does not change the issuer risk. When B7 is
added, B3 and B5 both default, and the issuer risk increases by 44% of the
notional of B7, whereas the "real risk" of this bond is only 3% of notional, and
B7 did not default at all. Finally with the full portfolio (B1...B8), three bonds
B4, B7, and B8 default.

How is this possible? The lumpy nature of the portfolio with a small number 
of bonds has exaggerated the problems, but all MC simulations do exhibit some 
of the same characteristics to some extent. The results may be mathematically 
correct, but physically absurd. The trader and risk manager who want to know the 
increase in issuer risk for buying a bond will be misled by the difference of 
simulation results including vs. not including the bond. The simulation results are 
unstable, because of the lumpy nature of the default risk. 


Now let's look at losses of the full portfolio (B1...B8) as a function of the
confidence level. Here are some results:


[Figure: Credit Loss vs Avg CL for 8 bonds. Vertical axis: Loss ($0 to $200,000). Horizontal axis: Avg CL (%).]
Actually what we have done is to calculate the average integrated risk
between confidence levels (and past the 69.1% CL) with the averages shown on 
the graph. See the end of Chapter 27 on VAR for an introduction. 

We can also look at the individual risks of the two business units, the sum of 
the standalone risks, and the diversified risk as a function of the CL. Below is a 
graph of the results. The sum of the standalone losses has to be bigger than the 
loss of the total portfolio. This is satisfied at the upper six CL in the graph, but 
violated in the lower two CL in the graph. While this violation is theoretically 
unsatisfactory, the risks happen to be small here. We examine these 
considerations below. 


[Figure: Standalone Risks vs Diversified Risk. Series: Loss(a), Loss(b), Loss(a+b), [Loss(a) + Loss(b)]. Vertical axis: loss ($0 to $175,000). Horizontal axis: CL, with labeled values including 69.1%, 98.0%, 99.0%, 99.86%, 99.97%.]


Credit Factors 

In order to sidestep the lumpy nature of MC issuer-risk simulations, a credit
factor approach can be taken. In this approach, a credit factor $f_\alpha$ is used for each
bond $B_\alpha$ with credit $\alpha$. We then write $\$C(B_\alpha) = f_\alpha \cdot \$N(B_\alpha)$ for the issuer
risk $\$C(B_\alpha)$ of the bond $B_\alpha$, with notional $\$N(B_\alpha)$. The credit factors can
be chosen such that the total issuer risk is the same as the issuer risk obtained at
the desired CL in a MC simulator at a given time with a given total portfolio. As
a new deal is added, the additional credit issuer risk is simply added using the
credit factor.

The advantage of this approach is that it is smooth, avoiding the lumpy 
character of the MC simulations. It also gives equal risk to the same bond that 
might administratively find itself in one portfolio or a different portfolio, 
avoiding internal arbitrage situations. 

In practice a number of issues arise. These are related to the changing nature 
of the portfolio over time (the credit factors start to lose their association with a 
MC simulation at a given CL), concentration issues (bigger deals should have 
bigger factors), recovery rate uncertainties, etc. Periodic updates have to be made 
to the credit factors. Finally, the normalization (sum of standalones, total 
diversified, etc.) has to be specified. 
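
One possible way to set up such factors is sketched below. It assumes a set of relative factors by rating is given and rescales them so the factor-based total matches a target issuer risk from a simulation. The function names, the relative weights, the positions, and the target number are all hypothetical; the text does not prescribe how the relative weights are chosen.

```python
def calibrate_credit_factors(positions, relative_factors, target_total_risk):
    """Scale relative factors so that the sum of factor * notional hits the target.

    positions: list of (rating, notional). relative_factors: rating -> weight.
    """
    unscaled = sum(relative_factors[rating] * notional
                   for rating, notional in positions)
    scale = target_total_risk / unscaled
    return {rating: scale * weight for rating, weight in relative_factors.items()}

def issuer_risk(rating, notional, factors):
    """Issuer risk of one bond: credit factor times notional."""
    return factors[rating] * notional

# Hypothetical portfolio and relative weights; the target would come from the MC run.
positions = [("Baa", 2000.0), ("Ba", 1000.0), ("B", 500.0)]
relative = {"Baa": 1.0, "Ba": 4.0, "B": 10.0}
factors = calibrate_credit_factors(positions, relative, target_total_risk=330.0)
new_deal_risk = issuer_risk("Ba", 200.0, factors)  # risk added by a new deal
```

A new deal then contributes smoothly, in proportion to its notional, until the factors are next recalibrated.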


Other Refinements (Exposure Changes, Liquidity, Etc.) 


In the same way that we discussed refinements to Plain-Vanilla VAR for market 
risk in another part of this book, we can also discuss liquidity times, exposure 
scenarios, multi-step MC simulations and other refinements for credit issuer risk. 


Issuer Credit Risk and Market Risk: Separation via Spreads 


Spread Uncertainty at Fixed Credit and Market Risk 

The uncertainty $\delta s_\alpha$ in the spread at fixed credit and fixed time, along with its
time dependence $d_t s_\alpha(t) = s_\alpha(t + Dt) - s_\alpha(t)$ over time $Dt$, introduces a
nonzero value in the change of bond value for no credit change,
$\delta B_\alpha^{NoCreditChange} \neq 0$. We can, purely by convention, call "market spread risk" the
change in the portfolio with no credit changes and no defaults. This risk is then
not to be included in the credit issuer risk.


Spread Gaps Between Different Credits and Credit Risk 

The separation of the total risk into market risk with unchanged credit $\alpha\to\alpha$
and credit risk with credit changes $\alpha\to\beta$ and defaults $\alpha\to$ Default makes
sense. First, spreads are affected by many technical market factors having 
incalculable (if any) relations or correlations with credit". Second, spread 
changes for credit transitions are generally much larger than spread changes for a 
given credit either for changing time or for fixed time. That is, 


$$\delta s_{\alpha\beta} \gg d_t s_\alpha \qquad (31.8)$$
$$\delta s_{\alpha\beta} \gg \delta s_\alpha \qquad (31.9)$$


Separation of Market and Credit Risk 


Because of the inequalities above, the market spread risk can be rather cleanly 
separated from the credit spread risk. The market risk calculation is done with 
statistics for the spread changes that are not big enough to involve credit changes. 

The risk separation also may be driven from "sociology", as market risk
managers and credit risk managers may be in different departments¹⁶.


Separating Market and Credit Risk without Double Counting 


To account for all the spread risk, we have to include all values of spread
changes. The relatively small spread changes associated with fixed credit $\alpha\to\alpha$
over one year do not, arguably, have any credit component. Rating agencies do
not change their definitions for rating criteria over one year for a fixed credit.
Although there is a credit component in the spread $s_\alpha$, this credit component
arguably cancels out in the difference $s_\alpha(t + 1\,\mathrm{yr}) - s_\alpha(t)$. For much larger
spread changes, credit changes $\alpha\to\beta$ do occur. Such very large spread changes
are associated with issuer credit risk. Since all spread changes (small and large)
are included, and since any spread change is assigned uniquely, there is no
double counting.

Double counting can be avoided because the small spread
uncertainty at fixed credit associated with market risk is much less than the large
spread gap between different credits giving credit issuer risk, as shown below:


16 Myopia: Separation of market and credit risk departments can lead to a situation where
some people may not see the advantage in a consistent unified credit + market risk 
calculation, or fully understand what it means. 
[Figure: Market Risk and Credit Risk Separation. Spread uncertainty at fixed credit (market risk) is much smaller than spread gap between different credits (credit risk). Labels: small spread uncertainty $\delta s_\beta$ at fixed credit $\beta$ => Market Risk; large spread gap $\delta s_{\alpha\beta}$ between different credits => Credit Risk; small spread uncertainty $\delta s_\alpha$ at fixed credit $\alpha$ => Market Risk.]


Example for High CL Credit and Market Risk 


In this section we give a numerical example. For high credit risk, we need to 
choose a bad year and junk credit. At the same time, the market risk is also high 
since spreads for a given credit then increase a lot. For illustration, we choose the 
bad year 1990, and we consider the transition from the junk credit $\alpha$ = Ba in
1989 to the even lower credit $\beta$ = B in 1990.

The picture below gives the idea¹⁷:


17 Acknowledgement: We thank Citigroup for the use of these spread data. Numbers 
rounded off. 




[Figure: Large Spread Change in 1989-1990. Credit Transition from Ba to B.]


The credit risk for the change in spread for this drop in junk credit from Ba to
B was thus 850 - 270 = 580 bp/yr, or in $ using the spread DV01,


$$\$C^{Credit\ Risk}_{Ba\to B} = 580\ \text{bp/yr} \cdot \$\Delta_{Spread\ DV01} \qquad (31.10)$$


We need to compare this large spread change for change in credit with the
change in spread at fixed credit due to market risk. For the market risk, we look
at the volatility $\sigma(s_{Ba})$ of the fixed credit Ba spread over one year. From the
data between 1986-1999 we get $\sigma(s_{Ba}) = 100$ bp/yr.


From MC simulations, we know that even at a very high overall CL for the
total risk, the CL for an individual variable is generally at or less than 99%.
Hence, for a bad level of risk for spreads at a given credit, consistent with the
Economic Capital CL of 99.97%, we take the 99% CL, or $k_{CL} \approx 2.3$ SD. The
stressed market risk for the Ba spread is thus $k_{CL}\, \sigma(s_{Ba}) = 230$ bp/yr, or in $
using the spread DV01,


$$\$C^{Market\ Risk}_{Ba} = 230\ \text{bp/yr} \cdot \$\Delta_{Spread\ DV01} \qquad (31.11)$$


Individual Variable CL: An individual variable’s CL is generally much less than the 
overall CL for the total risk. In practice, even the CL for an important variable is 
generally no larger than 99%, even with the overall CL being at the Aa-credit Economic 
Capital value 99.97%. See the chapters on VAR earlier in this book. 
Hence, at stressed levels for both market risk and credit risk, the market risk
is 40% of the credit risk in this example. There is no double counting since the 
regions of spread change are separated. 
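
The 40% figure follows directly from the numbers quoted above. As a worked check, assuming the same spread DV01 multiplies both risks so that it cancels in the ratio, and reading the 850 and 270 bp/yr as the B spread in 1990 and the Ba spread in 1989:

```latex
\frac{\text{stressed market risk}}{\text{credit risk}}
  = \frac{2.3 \, \sigma(s_{Ba})}{s_{B}(1990) - s_{Ba}(1989)}
  = \frac{2.3 \times 100\ \text{bp/yr}}{(850 - 270)\ \text{bp/yr}}
  = \frac{230}{580} \approx 0.40
```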


Advantages of Separating Market and Credit Issuer Risk 


An advantage of the separation is that market risk is generally calculated with 
many variables besides spreads (e.g. volatilities etc.) as we discussed in the 
chapters on VAR (Ch. 26-30). Credit issuer risk calculations do not involve these 
variables. Because we want to include spread risk consistently with risk from 
these other variables, the market/credit risk separation again makes sense. 

There are also idiosyncratic risks that the market risk managers may include 
that are not present in the spread data used by the credit risk calculation. 


Disadvantages of Separating Credit and Market Risks 


The first disadvantage of this separation is that at the corporate level, the market 
and credit risks have to be reassembled. Inconsistent assumptions and procedures 
between the market risk and the credit risk calculations can hinder this 
reassembly. In the next section, one way to achieve the combination is shown. 
Another downside is a different kind of double-counting error. This is due to
administration, rather than anything fundamental. Market risk managers, while
correctly assessing spread risk for unchanged credit ratings, may not include the
fact that the probability that the credit does not change is less than one. Market risk
managers may be consistent in their assumption that credit does not change by
choosing spread changes that are not big enough to be associated with credit
changes. However, in the end the market spread risk for credit $\alpha$ needs to be
multiplied by the diagonal probability $p_{\alpha\to\alpha}$.

There are reporting issues. Should the reduction of the market spread risk by
factors $(1 - p_{\alpha\to\alpha})$ be put into market risk? If so, the market risk managers have
to be concerned with the credit risk calculation of $(1 - p_{\alpha\to\alpha})$. Alternatively,
since the factor $(1 - p_{\alpha\to\alpha})$ is a credit factor, should the reduction in market risk
be subtracted from the credit risk? If so, how do we include correlations? These
conundrums are absent in the unified credit + market approach, described next.


A Unified Credit + Market Risk Model 


In this section we show one way to resolve the difficulties of the
separation of market and credit risk while still preserving the advantages of that 
separation. For examples of descriptions and specific aspects, see Ref ". 
The real problem is that large MC simulators may exist in the market-risk
world, and separately in the credit-risk world. A composite market/credit dual 
simulator requires a large effort. The calculation we suggest here is done without 
the need to construct such a huge dual simulation. 

We imagine that two MC simulations exist, one for market risk (including all
variables, not just spreads) and one for issuer credit risk as described above. We
tabulate all the states of the credit simulation $\{Credit_\#\}$ and the states of the
market simulation $\{Mkt_\#\}$.


Consider the drawing below: 


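[Figure: Unified Market Risk + Credit Issuer Risk. States of the credit simulator and states of the market simulator are matched on a similar spread change $d_t s_\alpha$; the example shown pairs market state #3,607 with credit state #1,293.]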


We use spread changes $d_t s_\alpha(t) = s_\alpha(t + Dt) - s_\alpha(t)$ that are assumed
present and known in both simulations in order to label the states consistently.

We then form composite states $\{(Mkt_\#, Credit_\#)\}$, each composite state
corresponding to a credit loss and to a market loss. We tabulate the total loss
histogram for these composite states, and find the total credit + market loss at a
given CL.

In the drawing above, the state #3,607 in the market simulation turns out to
have a spread increase $d_t s_\alpha$ close to that of the state #1,293 in the credit
simulation. The composite state containing all the information in both states is
defined and given the notation $(Mkt_{\#3,607}, Credit_{\#1,293})$.

Random selections of market and credit states are used to define the 
composite states. The losses in all the composite states are tabulated and the loss 
at the desired CL is picked out. 

In practice, bins of spread increases have to be defined. The spread used for 
the connection can be a weighted average of spreads, with weights corresponding 
to the losses in the individual simulations. 
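
The pairing of states through spread bins can be sketched as follows. This is a simplified sketch under assumptions: each simulation is reduced to a list of (spread change in bp, loss) pairs, a single spread is used for the connection instead of a loss-weighted average, and the names `composite_losses` and `composite_loss_at_cl` and all inputs are hypothetical.

```python
import random
from collections import defaultdict

def composite_losses(market_states, credit_states, bin_width_bp, n_composites, seed=0):
    """Pair market and credit states with similar spread changes and add their losses.

    Each state is a (spread_change_bp, loss) pair taken from an existing simulation.
    """
    rng = random.Random(seed)
    credit_by_bin = defaultdict(list)
    for spread_change, loss in credit_states:
        credit_by_bin[int(spread_change // bin_width_bp)].append(loss)
    # Keep only market states whose spread bin also occurs in the credit simulation.
    eligible = [(int(ds // bin_width_bp), loss) for ds, loss in market_states
                if int(ds // bin_width_bp) in credit_by_bin]
    if not eligible:
        raise ValueError("no overlapping spread bins between the two simulations")
    totals = []
    for _ in range(n_composites):
        spread_bin, market_loss = rng.choice(eligible)  # random market state
        totals.append(market_loss + rng.choice(credit_by_bin[spread_bin]))
    return totals

def composite_loss_at_cl(losses, cl):
    """Total loss at confidence level cl from the list of composite losses."""
    ordered = sorted(losses, reverse=True)  # largest loss first
    rank = max(int(round((1.0 - cl) * len(ordered))), 1)
    return ordered[rank - 1]

# Hypothetical states: (spread change in bp, loss).
market = [(10.0, 1.0), (60.0, 5.0)]
credit = [(12.0, 2.0), (65.0, 20.0)]
totals = composite_losses(market, credit, bin_width_bp=25.0, n_composites=1000)
total_99 = composite_loss_at_cl(totals, 0.99)
```

Neither existing simulator has to be rebuilt; only their tabulated states are needed.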


Geometry for Credit + Market Risk 


As in the chapters on VAR, there is a geometrical interpretation associated with 
the two types of risk (credit, market). The picture gives the idea: 


[Figure: Geometry for Credit + Market Risk, showing the credit and market components $CVAR_{Credit}$ and $CVAR_{Market}$.]


For a review, see Ch. 29. The quantity $\sigma_{CVAR}$ is the CVAR volatility, or the
uncertainty in the composition of the fixed total risk from the individual (credit
and market) CVARs. Also, the correlation is $\rho^{eff}_{Market,Credit} = \cos(\theta_{Market,Credit})$, as
we next describe.
Effective Correlation between Market and Credit Issuer Risks

As a final topic, we note that an effective correlation
$\rho^{\text{eff}}_{\text{Market,Credit}}$ between credit issuer risk and market
risk can be calculated using the composite formula

$\sigma_{\text{Total}}^2 = \sigma_{\text{Market}}^2 + \sigma_{\text{Credit}}^2 + 2\,\rho^{\text{eff}}_{\text{Market,Credit}}\,\sigma_{\text{Market}}\,\sigma_{\text{Credit}}$     (31.12)

Here, $\sigma_{\text{Market}}$ is the VAR market risk from all market variables,
$\sigma_{\text{Credit}}$ the credit issuer risk, and $\sigma_{\text{Total}}$ the
total market + credit risk from the unified calculation.
It should be noted that usually the logic runs the other way. That is, an ad-hoc
assumption has to be made for $\rho^{\text{eff}}_{\text{Market,Credit}}$ in order
to get $\sigma_{\text{Total}}$ from the independent calculations of market and
credit issuer risk.
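
Eq. (31.12) can be used in either direction. The short sketch below solves it for the effective correlation given the three risks, and also runs the more usual logic of assuming a correlation to get the total. The function names are illustrative only.

```python
import math


def effective_correlation(sigma_total, sigma_market, sigma_credit):
    """Solve Eq. (31.12) for the effective market-credit correlation."""
    return ((sigma_total ** 2 - sigma_market ** 2 - sigma_credit ** 2)
            / (2.0 * sigma_market * sigma_credit))


def total_risk(sigma_market, sigma_credit, rho_eff):
    """The usual direction: assume rho_eff and get the total risk."""
    return math.sqrt(sigma_market ** 2 + sigma_credit ** 2
                     + 2.0 * rho_eff * sigma_market * sigma_credit)
```

For example, with market risk 3 and credit issuer risk 4 (in any common units), zero effective correlation gives a total risk of 5.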


Counterparty Credit Risk Example: Swaps 


A counterparty of a deal is just the other party to the deal, for example a swap. 
We will look at the risk from the point of view of a broker-dealer BD. The 
counterparty will be called ABC. The counterparty risk for BD is the risk that
the counterparty ABC defaults on some condition of the deal. 

Counterparty risk can be calculated using multi-step Monte-Carlo (MC)
simulations of underlying variables, along with models for the securities at future
times. The MC simulations move forward in time.

Potential counterparty default events are built into the simulation. For a given
MC path, a potential default event causes a loss to BD at future time t
for any security “in the money" (ITM) to BD at time t, i.e., for which the
counterparty ABC owes money to BD. The simulator retains all ITM cash
flows along each path. Other non-ITM cash flows are ignored. The potential
losses are then tabulated for different MC paths at different future times, and
various statistics (average loss, loss at some CL, etc.) are calculated.

For example, we can calculate the counterparty risk including default at a
99% CL at time t by picking out the 100th worst potential loss in a simulation of
10,000 paths at that time. This loss is discounted back to the present time.
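
The selection of the loss at a given CL can be sketched as below. The function names are hypothetical, and the continuously compounded flat discount rate is an assumption made only to keep the example short.

```python
import math


def loss_at_cl(losses, cl):
    """Pick the loss at confidence level cl from simulated path losses.

    With 10,000 paths and cl = 0.99 this is the 100th worst loss.
    """
    ranked = sorted(losses, reverse=True)          # worst loss first
    k = max(1, round((1.0 - cl) * len(ranked)))    # rank of the CL loss
    return ranked[k - 1]


def discount_to_today(loss, rate, t_years):
    """Discount a future loss to today (assumed flat continuous rate)."""
    return loss * math.exp(-rate * t_years)
```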

We can also calculate the potential future exposure PFE that does not include 
default. The PFE is not discounted. 

For illustrative purposes, counterparty risk can be obtained rather easily for 
scenarios, e.g. at the 99% CL scenario envelope of interest rates in the future 
generated by a model of interest-rate diffusion. Here is a simple example. 
Illustrative Counterparty Credit Risk Example for a Swap

Consider a new deal, a pay-fixed swap from the point of view of the broker-dealer
BD with counterparty ABC, using a scenario (i.e. one path) for
illustration. Here is a picture:


[Figure: Remaining Swap vs Time; 99% CL Scenario. Curves: swap value and
average swap value. Vertical axis: $0 to $100,000. Horizontal axis: time (yrs).]


The swap value along this path starts increasing with time, since BD 
receives the floating rates assumed to increase with time under the scenario. 
Ultimately at maturity, the value goes to zero because the swap disappears. So 
there is a maximum point for the forward swap value on this rate path, occurring 
at around 1.5 years. 

The path is the 99% CL for the break-even rate (2.33 standard deviations)
with a lognormal volatility of 0.2. The rate starts at $r_0 = 5\%$ and increases to
around 13% after four years. The calculation is:

$r(4\,\text{yrs}) = 5\% \times \exp\left(2.33 \times 0.2 \times \sqrt{4}\right) = 12.7\%$     (31.13)
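
As a quick check of Eq. (31.13), the scenario rate can be evaluated at any horizon. This is only a sketch of the envelope formula used above; the function and argument names are illustrative.

```python
import math


def scenario_rate(r0, lognormal_vol, n_std, t_years):
    """Rate on the n_std-standard-deviation lognormal envelope at time t."""
    return r0 * math.exp(n_std * lognormal_vol * math.sqrt(t_years))


if __name__ == "__main__":
    # Semiannual points of the 99% CL (2.33 standard deviation) scenario.
    for k in range(0, 9):
        t = 0.5 * k
        print(f"t = {t:3.1f} yrs   r = {100 * scenario_rate(0.05, 0.2, 2.33, t):5.2f}%")
```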


The picture above shows semiannual forward swap values viewed today (i.e.
discounted back to today). The picture would look more complicated (a saw-tooth
behavior) if the values were plotted at intermediate times. The positive
swap value at a future time t, called the potential future exposure or PFE,
represents the risk at t to BD if counterparty default were to occur at t. This is
because if the counterparty ABC defaults on the terms of the swap, BD will
lose the positive value of the remaining ITM swap payments owed to BD
(relative to the smaller value of the fixed-rate payments owed to ABC by BD).


12 Swap details: The notional is $1 MM, the maturity is four years with semi-annual payments,
starting today at par.

As an approximation, we can use the time-averaged swap value for the given 
scenario, shown by the constant line. This value is called the Expected Positive 
Exposure or EPE. 

The above picture used only one path. More generally, using all paths up to a
given time t such that the swap has positive value produces an appropriate
option on the swap (a swaption) struck at zero. Thus the potential future
exposure for a swap is determined by a basket of swaptions.


Loss Given Default (LGD) and Counterparty Risk 


The “loss given default” or LGD, i.e. the potential loss in real money if
counterparty default occurs, is the PFE of the swap minus any expected amount
(the “recovery value”) that the courts might give on the swap if the counterparty
actually does default. The actual counterparty risk at a given time t is obtained
by using the probability of default PD(t) of the counterparty at t. The LGD,
multiplied by PD(t), produces the swap counterparty risk at t in real money.


Counterparty Risk Valuation for a Swap 


The total counterparty risk on the swap, as determined today, is the sum of the
losses at the various times {t}, discounted back from these times to today.
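
Putting the last two sections together, a minimal sketch of the valuation is given below. It assumes that PD(t) is the probability of default attributed to each time point, that recovery is given as a money amount, that the LGD is floored at zero, and that discounting uses a flat continuously compounded rate. All names and inputs are hypothetical.

```python
import math


def counterparty_risk_today(times, pfe, recovery, pd, rate):
    """Sum over times of discounted PD(t) * LGD(t), with LGD = PFE - recovery."""
    total = 0.0
    for t, exposure, recovered, prob_default in zip(times, pfe, recovery, pd):
        lgd = max(exposure - recovered, 0.0)           # loss given default at t
        risk_at_t = prob_default * lgd                 # counterparty risk at t
        total += math.exp(-rate * t) * risk_at_t       # discount back to today
    return total
```

For instance, a single time point with PFE 100, recovery 40, PD 2% and zero discount rate gives a counterparty risk of 1.2.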


Collateral Agreements 


The counterparty risk considerations include a Credit Support Annex or CSA
agreement with the counterparty to post appropriate collateral. In some systems
the collateral is not simulated, but the cash value of the collateral is tracked.


? Swaptions are discussed in Chapter 11. Swaptions can be priced with closed formulas 
for simple assumptions, or by numerical codes for more complicated models. 


?! Story: I did this swap counterparty risk calculation in the early 1990's when I was the 
quant on a swaps desk as a suggestion for risk management, and sent it up the 
management food chain. There was no response. Probably they had no idea what to do 
with it, since counterparty risk was not being evaluated at that time. 


? Recovery: Recovery value models are complicated and depend on the product, the 
legal court situation, and the seniority of the claim. Often simple assumptions are used 
based on average historical experience. 


? History: Although collateral is common now, it wasn't when I started at a swaps desk 
in the early 90's. Then I asked a trader if counterparties needed to post collateral. He 
looked at me and said no, nobody in the swaps business requires collateral (how could I 
CVA “Risk Neutral” and PFE “Real World” Simulations


Risk Neutral (RN) vs. Real World (RW) dynamics play an important role in
calculating credit risk. There are two issues: (1) RN vs. RW simulation, and (2) RN
pricing using inputs from the simulated paths (either RN or RW).


CVA or Credit Valuation Adjustment - Risk Neutral Simulation 


CVA is the change in a traded price of an instrument (e.g. a swap) due to the
possibility of future counterparty default. An example for a swap was given 
above. Since CVA is included in traded prices, it is calculated with risk-neutral 
(RN) dynamics, with parameters from no-arbitrage considerations. The procedure 
uses RN MC simulation to calculate random future states. Random sampling is 
done on these states. Default probabilities are inserted at each time. The losses 
given default are calculated. 

The results are then aggregated and statistics calculated. 

In a path integral context (cf. Chapters 41-45), risk-neutral simulation is done
using the risk-neutral transition-probability Green functions, with bins (at each
time) defining possible future states of the underlying variables within ranges.


PFE or Potential Future Exposure - Real World Simulation 


PFE measures the future exposure (and thus potential loss) before default
probability is considered. PFE is a regulatory quantity used as input for
determining lending limits for banks. Because a limit is a real-world quantity and
is not traded, RW dynamics are appropriate for the simulation of future states, not
RN simulation.

PFE simulations are done in two stages. In the first stage, the RW future states 
are calculated. The idea is to get the best approximation to the real world in the 
future. In the second stage, for a given future RW state, RN models are used to 
get security values in that RW state. 

To get the idea, think of the state of the world today - it is by definition the 
real world. We use today's RW input parameters as input to RN models. 


not know that). Just another example of changing attitudes toward risk. Counterparty risk 
management then just meant that there were limits by counterparty. 


? Problems with PFE using RN simulation: There are moreover fundamental and 
unavoidable problems with risk neutral (RN) calculations of PFE. In Ch. 43 we note 
Stein's ambiguity in risk-neutral calculations of PFE. That is, PFE is not actually well 
defined for RN dynamics, with different arbitrary assumptions — all theoretically 
“correct” - producing different values for PFE. 

Nonetheless, some institutions calculate PFE using the same RN simulation as for 
CVA, mostly in order to save effort and time. 
For PFE, we try to replicate this procedure to get realistic RW future states,
from which RW future input parameters are used as input to RN models.
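
The two-stage logic can be sketched for the pay-fixed swap used earlier. This is a deliberately crude illustration: the real-world rate dynamics are taken to be lognormal with an assumed real-world drift, and the "RN model" in stage two is a flat-curve valuation that uses the simulated rate both as the forward floating rate and as the discount rate. All names and parameters are assumptions.

```python
import math
import random


def swap_value_flat_curve(rate, fixed_rate, notional, n_remaining, dt=0.5):
    """Stage 2 (assumed pricing model): value of the remaining pay-fixed swap
    payments in a given future state, using a flat curve at the state's rate."""
    return sum(notional * (rate - fixed_rate) * dt * math.exp(-rate * dt * k)
               for k in range(1, n_remaining + 1))


def pfe_profile(r0, fixed_rate, notional, maturity, rw_drift, vol, cl,
                n_paths, seed=0, dt=0.5):
    """Return [(time, PFE at confidence level cl)] on the payment dates."""
    rng = random.Random(seed)
    n_steps = int(round(maturity / dt))
    rates = [r0] * n_paths
    profile = []
    for step in range(1, n_steps + 1):
        # Stage 1: evolve the rates with real-world (RW) dynamics.
        rates = [r * math.exp((rw_drift - 0.5 * vol ** 2) * dt
                              + vol * math.sqrt(dt) * rng.gauss(0.0, 1.0))
                 for r in rates]
        # Stage 2: value the remaining swap in each RW state and keep
        # only the positive values (exposure to the counterparty).
        n_remaining = n_steps - step
        exposures = sorted(
            max(swap_value_flat_curve(r, fixed_rate, notional, n_remaining, dt), 0.0)
            for r in rates)
        idx = min(n_paths - 1, int(cl * n_paths))
        profile.append((step * dt, exposures[idx]))
    return profile
```

The exposure at maturity is zero because no payments remain, in line with the single-path picture above.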


PFE and the Macro-Micro Model 


I believe that the Macro-Micro (MM) model would be desirable for RW PFE 
simulations. This is because real world dynamics contain time scale effects that 
closely resemble those of the MM model. The MM model can closely 
approximate real-world changes, and the parameters for the MM model are taken 
from real-world historical data. See Ch. 47-51 for details. 


Correlated Defaults — Analytic Results for 2D Merton Model 


A complication for modeling probabilities of default is that correlations between
defaults of two entities ABC, DEF can be important. Here we give some analytic
model results. Specifically, we calculate correlated defaults approximately in an
extended two-dimensional Merton model using the formalisms in Ch. 17, 19.
Equity correlations $\rho^{ABC,DEF}_{Equity}$ induce default correlations. We
consider the (x, y) plane with, for simplicity, the ABC default (knockout)
barrier at x = 0 and the DEF default barrier at y = 0.


There are three steps in the calculation.


1st Step: Temporarily assume zero correlation, $\rho^{ABC,DEF}_{Equity} = 0$. Three
ordinary 1-D barrier images in the (x, y) plane are inserted (cf. Ch. 17), like a
quadrupole: in the 2nd quadrant at $(x^{image}, y)$, in the 3rd quadrant at
$(x^{image}, y^{image})$, and in the 4th quadrant at $(x, y^{image})$. The 2D
Green functions with the three images are: [subtracted (2nd Quadrant), added (3rd
Quadrant), and subtracted (4th Quadrant)] from the 2D Green function (1st
Quadrant), all with $\rho^{ABC,DEF}_{Equity} = 0$. This gives the exact
$\rho^{ABC,DEF}_{Equity} = 0$ solution of the 2D diffusion equation, exactly
satisfying the boundary conditions.

Below is a picture of the 1st step at fixed time.
                          y axis, x = 0
                               |
    2nd Quadrant               |    1st Quadrant
    (x^image, y)               |    (x, y)   Physical Region
                               |
  -----------------------------+-----------------------------  x axis, y = 0
                               |
    3rd Quadrant               |    4th Quadrant
    (x^image, y^image)         |    (x, y^image)


1st Step of Calculation
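
The sign pattern of the 1st step can be checked numerically in the simplest setting. The sketch below assumes driftless diffusion with constant variances, so that each 1-D Green function is a Gaussian and each image point is the mirror reflection of the source through the corresponding axis; with drift, the images of Ch. 17 carry extra factors that are not shown here. The function names are illustrative.

```python
import math


def gauss1d(x, x0, var):
    """Free 1-D diffusion Green function (Gaussian) with variance var."""
    return math.exp(-(x - x0) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def green_quadrant_zero_corr(x, y, x0, y0, var_x, var_y):
    """Zero-correlation 2-D Green function for the first quadrant, with
    absorbing (default) boundaries at x = 0 and y = 0."""
    direct = gauss1d(x, x0, var_x) * gauss1d(y, y0, var_y)     # 1st quadrant
    image2 = gauss1d(x, -x0, var_x) * gauss1d(y, y0, var_y)    # 2nd quadrant
    image3 = gauss1d(x, -x0, var_x) * gauss1d(y, -y0, var_y)   # 3rd quadrant
    image4 = gauss1d(x, x0, var_x) * gauss1d(y, -y0, var_y)    # 4th quadrant
    # Quadrupole pattern: subtract, add, subtract.
    return direct - image2 + image3 - image4
```

With these signs the result factorizes into a product of two 1-D barrier solutions, so it vanishes on both axes, which is the boundary condition for default.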


2nd Step: The 2nd step uses a perturbation to go from zero correlation to the
physical non-zero correlation $\rho^{ABC,DEF}_{Equity} \neq 0$. To this end, the 2D
hybrid barrier in Ch. 19 is used twice: (1) with nonzero correlation
$\rho^{ABC,DEF}_{Equity} \neq 0$ (the 2D Green function including the image is
added), and (2) with zero correlation $\rho^{ABC,DEF}_{Equity} = 0$ (the 2D Green
function including the image is subtracted). The zero correlation result from the
2nd step approximately cancels the zero correlation 1st step result.
The hybrid barrier physical region in Ch. 19 is the upper half plane, but we
need the first quadrant of the (x, y) plane. In order to produce the desired
geometry, new variables (u, v) are used so that the w = u + iv upper half
complex plane is mapped into the first quadrant of the z = x + iy complex plane
using the square root transformation $z = \sqrt{w}$. The x = 0 and y = 0 axes after
the transformation represent defaults of ABC and DEF.
Below is a picture illustrating the 2nd step of the calculation at a fixed time.
[Figure: The map z = sqrt(w) takes the upper half of the w = (u, v) plane into
the 1st quadrant of the z = (x, y) plane.]


2nd Step of Calculation


The result for the 2D Green function after the transformation is a multiple 
convolution (actually a local volatility path integral, cf. Ch. 42); the same is true 
for the image Green function. This is exact because the square root 
transformation is a conformal transformation. 
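
The geometry of the square root map is easy to verify numerically. The sketch below uses the principal branch of the complex square root; the function name is illustrative.

```python
import cmath


def to_first_quadrant(u, v):
    """Map a point w = u + iv of the upper half plane to z = sqrt(w) = x + iy."""
    z = cmath.sqrt(complex(u, v))
    return z.real, z.imag


if __name__ == "__main__":
    # Interior points (v > 0) land strictly inside the first quadrant.
    print(to_first_quadrant(-3.0, 4.0))
    # The boundary v = 0 maps to the two axes: u < 0 goes to x = 0 and
    # u > 0 goes to y = 0, i.e. to the two default boundaries.
    print(to_first_quadrant(-4.0, 0.0))
    print(to_first_quadrant(4.0, 0.0))
```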

The result can be approximated to get a tractable analytic result. The
approximation satisfies the initial condition and the boundary default conditions
exactly, and the 2D diffusion equation approximately.


3rd Step: All terms from the first two steps above are added (with the correct
signs for the images).


The Eight-Fold Way Analytic Result for Correlated Defaults 


The final result has eight terms. There are four terms for the 1st step (the 2D
$\rho^{ABC,DEF}_{Equity} = 0$ Green function and the three images). There are four
terms for the 2nd step (the usual 2D Green function and its image for nonzero
correlation $\rho^{ABC,DEF}_{Equity} \neq 0$, and the usual 2D Green function and
its image for zero correlation $\rho^{ABC,DEF}_{Equity} = 0$). The zero
correlation $\rho^{ABC,DEF}_{Equity} = 0$ terms approximately cancel.


The finite number of terms (eight) in the analytic approximate solution is to 
be compared with the infinite number of terms in the formal solution series of 
eigenvectors of the 2D diffusion equation (which must be truncated in practice). 


? Numerical Example: See ref. viii. 


2 Bad Joke Flag: This solution has nothing to do with the Eight Fold Way in physics or 
anything else. I just use the name to remember the number of terms. 
Probability of default calculations using the model


Once the full model Green function is obtained, the probabilities of default for one
or both entities over a given time can be found by straightforward integration
along with sum rules similar to those in Ch. 18.

For many entities, the model is applied to each pair, in all combinations. 


Correlation between Default Times 


The model produces correlations between the default times of two entities (if 
both barriers are hit). If input correlation increases, the correlation between 
default times increases because both barriers tend more to be hit at similar times. 
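
This qualitative statement can be illustrated by brute force rather than with the analytic model. The sketch below simulates two correlated driftless Brownian variables with default barriers at zero and measures the correlation of the default times on paths where both barriers are hit. The discretization, the parameters, and the function names are all assumptions for illustration.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def default_time_correlation(rho, n_paths=2000, n_steps=100, horizon=5.0,
                             start=1.0, vol=1.0, seed=1):
    """Correlation between first-hitting times of the barriers x = 0, y = 0."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    sd = vol * math.sqrt(dt)
    times_x, times_y = [], []
    for _ in range(n_paths):
        x = y = start
        hit_x = hit_y = None
        for step in range(1, n_steps + 1):
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            x += sd * z1
            y += sd * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
            if hit_x is None and x <= 0.0:
                hit_x = step * dt
            if hit_y is None and y <= 0.0:
                hit_y = step * dt
            if hit_x is not None and hit_y is not None:
                break
        # Keep only paths on which both barriers are hit.
        if hit_x is not None and hit_y is not None:
            times_x.append(hit_x)
            times_y.append(hit_y)
    return pearson(times_x, times_y)
```

Comparing, say, `default_time_correlation(0.9)` with `default_time_correlation(0.0)` shows the increase in default-time correlation with the input correlation.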


Misc. Topics: WWR, FVA, Factor Models, Firmwide Risk 


Wrong Way Risk (WWR) 
WWR is the risk in a particularly bad set of future states. WWR is the Murphy’s
Law of credit where everything goes wrong at the same time, magnifying risk. In 
the WWR region, exposure to a counterparty increases (more risk) and also that 
counterparty’s credit gets worse (more risk). A calculation of WWR thus depends 
on correlations between credit changes and underlying variable changes (on 
which counterparty exposure changes depend). 

To get the idea, consider a hypothetical calculation where at each historical
time t we record the values of all relevant underlying variables x(t) and
also credit measures (e.g. spreads or probabilities of default) for all entities, as
discussed above. Changes of all quantities over each historical time interval can
then be used to obtain the credit-exposure correlations. As for Monte Carlo
HVAR (Ch. 26), a simulator can choose random changes from the historical state
changes (underlying and credit). The WWR region (exposure increases and credit
gets worse) comprises a subset of such states, and WWR can thus be extracted.
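
A toy version of this hypothetical calculation is sketched below for a single counterparty. It assumes that exposure and a credit spread have been recorded at the same historical times, and it identifies the WWR region as those intervals where both increase. The names and data layout are illustrative only.

```python
import math


def changes(series):
    """Changes over each historical time interval."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def wrong_way_states(exposure, spread):
    """Return the credit-exposure correlation of changes and the WWR subset."""
    d_exposure, d_spread = changes(exposure), changes(spread)
    corr = pearson(d_exposure, d_spread)
    # WWR region: exposure increases AND credit gets worse (spread widens).
    wwr = [(i, de, ds)
           for i, (de, ds) in enumerate(zip(d_exposure, d_spread))
           if de > 0 and ds > 0]
    return corr, wwr
```

A simulator in the spirit of Monte Carlo HVAR would then draw random intervals and keep the joint (exposure, credit) changes together, so that the historical correlation is preserved.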


Default correlations between different entities are a by-product of this
procedure.


FVA or Funding Value Adjustment 


FVA is the change in the price due to funding costs, which depend on credit.
For example, in Ch. 14 we qualitatively considered the price sensitivity with 
respect to different interest rates, due to credit, in convertible product models. 
The FVA formalism is complex. FVA considerations include collateral, 
which must be funded. All cash flows, including corporate treasury, repo etc. 
must be tracked, along with funding rates (which depend on paying or receiving). 
Here are some remarks for a swap to give the idea. Recall the picture of a swap
along one path above, and see Ch. 8 for an introduction to swaps.


Funding and Backward Induction for FVA 


Future funding costs for future cash flows {CF} of the swap are evaluated, and
these future funding costs are added to the future cash flows. At time points along
those paths requiring collateral, funding for the collateral is needed. The swap
now becomes path dependent.

The general formalism including FVA involves backward induction, as for
American Monte Carlo (cf. Ch. 44). At the last time step before the end of the
deal, the situation simplifies enough so that the funding can be evaluated for each
path along with the price, at that time. This information is propagated back to the
2nd-to-last time step before the end of the deal, at which time the funding and
price can be evaluated for each path, given the information from the last time step
and the transition probabilities for underlying variables between the 2nd-to-last
step and the last step. We continue this process back to the beginning of the deal.

The credit and funding dynamics are coupled. Funding at a given time 
depends on the default status of the deal at that time, along with the contract 
conditions for cash flows if default occurs. 

For path-dependent deals a forward simulator is needed to get the cash flows 
in the first place, along with backward induction to get the funding for the cash 
flows. Any life-cycle event on a path is recorded, and if default occurs the path is 
terminated. 


Idiosyncratic Factor-Model Improvement for Equity Counterparty Risk 


For equity and other counterparty risk simulations, simplifying assumptions may 
be made to reduce complexity by introducing factors, to which the (e.g.) equity 
returns are correlated. Ch. 25 describes work on improving the consistency of the 
models with the data for equity return correlations. This is done by incorporating 
correlated idiosyncratic / residual terms that are not normally considered. 


Firm Wide Counterparty Risk Issues 


To deal with firm-wide counterparty risk, MC simulation is run for the possibly 
thousands of deals in the various portfolios for different counterparties. The 
calculations generally have to be done over a long future time (e.g. 20 years), 
corresponding to the possible times that counterparties can default in the future, 
until the maturities of deals. Appropriate netting logic has to be incorporated for 
clearly offsetting positions. Such counterparty risk calculations can be a huge
enterprise, both for computation and for the collection of the data. On a
firm-wide basis, because securities depend on many types of variables, the simulator
includes multivariate statistics for the generation of the paths. Both risk-neutral
(CVA) and real-world (PFE) simulations are relevant, as described above.


27 Netting is a complicated legal subject. Netted cashflows for a portfolio of swaps can be
treated together for risk assessment.

Models for the securities are often very computationally intensive for such 
simulations. Simpler models are sometimes used especially for options. Recently, 
“American Monte Carlo” or AMC has been used in counterparty risk. We will 
discuss AMC later in the book (cf. Ch. 44). 

“Life-cycle events” have to be included in the counterparty risk simulation 
for those options that exercise (including swaptions that can exercise into forward 
swaps, barrier knock-ins or knock-outs, etc.). 

Collateral also needs to be included, and if not modeled explicitly, at least on 
an equivalent cash basis. 


Regulations 


Regulations are complicated, and moreover they keep changing. Basel III
(following earlier Basel accords) deals with bank capital adequacy, stress testing, 
and market liquidity risk. The Fundamental Review of the Trading Book or FRTB 
for capital standards is currently under consultation, including refinements of: (1) 
Clarification for aspects of the “trading book" and the “banking book". (2) 
Liquidity horizons (cf. Chapters 27, 30). (3) Calculations of instrument 
sensitivities. There are different regulations (including stress tests) applicable to 
large and small institutions, and different regulations by geography. We cannot
keep this book finite and also discuss regulations at any length.
Regulations will keep an army of IT and quants employed for a long time. 
32. Model Risk Overview (Tech. Index 3/10)


This short and non-technical chapter contains some observations on models with 
an emphasis on risk. Model Quality Assurance will be treated in the next chapter. 
You can read this chapter without having to read the rest of the book. 


Summary of Model Risk 


We start with the obvious comment that models are now an indispensable part of 
modern finance. Securities and derivatives require model pricing. Therefore, 
models are indeed indispensable. Finance could not live without them. 

Nonetheless, in spite of the best efforts of many talented and smart people, 
model risk is constantly present at some level and is due to many causes. One risk 
is the variability in model assumptions, none of which can be proved in any 
rigorous way regardless of the mathematical sophistication with which the ideas 
are presented. Some important effects on prices can be modeled only
imperfectly, if at all. No financial model has the status of a “law of physics”, 
even if physics-based diffusion models and other concepts are used. Further, 
there is no “best” model, regardless of whose ego is involved. Different firms can 
and do have different models for the same instrument. 

Model risk is hidden unless model-to-model or model-to-market comparisons 
are made. The risk is highest for illiquid, long-dated options. For highly liquid 
instruments, models are standardized with slight variations. Substantial losses 
due to model risk have occurred even for plain-vanilla products, however.

Model risk includes the risk of using approximate or inappropriate parameter 
types, or using the model in inappropriate parameter regimes. Models are used in 
practice to parameterize securities in some approximate way, and are usually only 
to be trusted for some short extrapolation from the region where market data are 
available. These parameters include time to maturity, strike values for options, 
etc. 


' Rigor Mortis? The model risk problem is not alleviated by mathematical rigor applied 
to models, since models contain non-rigorous assumptions that cannot be proved. 


? My Favorite Options Model: My favorite options model is the theory as described to 
me by a trader as “Picking up dimes in front of a steam roller". 
The intentional use of inaccurate parameter values is a separate problem.

Numerical approximations are unavoidable and are a function of available 
time and resources, but they can lead to difficulties. 

Part of model risk lies in the pitfalls of software development. See Ch. 34. A 
host of mundane but important issues exists: coding errors, computer 
malfunctions, misinterpretations, communication snafus, inconsistencies, etc. 
Anyone who wants to get an idea of the difficulty is invited to sit down at the
computer and give it a try.

Model-generated hedging predictions are more problematic than pricing, 
since hedging involves taking differences of prices under changing market 
conditions. Models differ more in the hedges than in the prices. 

Model risk issues have arisen in a number of contexts. The interested reader 
is invited to consult the literature and the references.


Model Risk and Risk Management 


Models are a cornerstone of risk management. Models characterize the
behavior of financial instruments under different possible environments. This 
information is used to determine the risk of these instruments, and thus of 
departments, and ultimately of the corporation with respect to the markets. We 
have exhibited a variety of model calculations. We need to understand the model 
limitations. These limitations translate into a risk associated with the very models 
used for assessing risk. Model risk results from model limitations. 


Time Scales and Models 


Financial markets exhibit different behaviors at different time scales. No model 
used today accurately describes the dynamics in all the various time-scale 
regimes. The incorporation of these time scales is the biggest challenge in 
financial modeling. Models often assume that functions of underlying variables 
(interest rates, stock prices, FX rates) follow some sort of Brownian or random 
walk diffusion with constrained drift. In the real world, the parameters are often 


? Operational Risk: We do not count fraud due to intentional mismarking by deliberately 
choosing off-market parameters as model risk. This would be placed in operational risk. 
If model risk is counted as operational risk (it has to go somewhere), then fraudulent use 
of the models should be listed separately from the quantitative risk discussed above. 


^ Homework: Experience is the Best Teacher for Constructing Models and 
Prototype Systems: Try putting together some model code into a prototype system, 
including input/output, a GUI etc., that you design and program yourself. You will learn a 
great deal about models, systems, data, perseverance, and probably life itself. 
hard to determine or identify. Moreover, many effects break the Brownian
assumption. Manifestly different dynamics in financial markets occur at different 
time scales: short, medium, and long. In principle, this means that risk 
management on different time scales should be different. 


Mean Reversion 


Time scales that are often used are connected with mean reversion in a Brownian 
framework. Mean reversion can depend on a number of market variables, which 
can change from one period to another. This includes the disappearance of mean 
reversion altogether under adverse conditions. This happened for example in the 
fall of 1998, to the consternation of some institutions (e.g. the famous hedge fund
LTCM) that had placed large leveraged bets on mean reversion. Of course the big 
question with mean reversion is: “Reversion to What, When?” 


Jumps, Gaps, and Nonlinear Diffusion 


Clear violations of Brownian motion with or without mean reversion occur on 
short time scales with "gap or jump" behavior. Jumps are often parameterized by 
a Poisson distribution. Jumps have to do with the response of traders to some sort 
of important unexpected news. We have no reliable way to model jumps. 
Actually, jumps may arise from collective non-linear phase-transition effects over 
short times. Possibly non-linear diffusion including a modified version of the 
Reggeon Field Theory (RFT) used in elementary particle physics may be helpful; 
this approach generalizes Brownian motion. The RFT is described in Ch. 46. An 
alternative approach involves the non-linear dynamics of chaos. 


Long-Term Macro Component with Quasi-Random Behavior 


Risk management for long-term securities, or over a long term, needs to deal with
realistic dynamics over a long term. Violations of Brownian motion occur on 
long time scales with "quasi-equilibrium random macro" behavior. Here, 
macroeconomics affects financial variables in a smooth but changing fashion, 
which explicitly involves time scales on the order of months or more. This 
"macro" behavior dependent on macroeconomics cannot be described with 
Brownian motion. We describe the “Macro-Micro” model in Ch. 47-51, which is
targeted at coping with these issues. 


Liquidity Model Limitations 


There is a variety of other problems not included in models. Effects related to 
supply and demand, the trading volume, and the time needed to sell a security are 
lumped together into "liquidity", and there is no good way to model these
effects. Often they are just left out of the model price. 

Bid-offer spreads are a related issue. If there is only a one-sided market so 
that (for example) the selling price is not known, then these spreads must be 
estimated independent of the model algorithm. 


Which Model Should We Use? 


Because there is no real theory of finance in the sense of physics, financial 
models are not unique. Different institutions, especially for illiquid financial 
products, often use different models. If one model is used in place of another, the 
differences in the values of the securities and the differences in the sensitivities 
with respect to movements in the underlying variables become an issue in risk 
management. For example, if one model reports that the interest-rate dependence of a
partially hedged position is near zero while another model, just as sophisticated
and defended with at least as much exuberance, reports that the interest-rate
dependence of the same position is large, which statement do you believe?

Sometimes a proprietary desk model is used for trading, and another model 
with simpler assumptions but widely used on the Street is used for corporate risk 
management reporting. Which model should be used to measure risk? The real 
answer is that there is model risk. Different models give different results. 
Therefore, there is an uncertainty in risk reporting due to the very existence of 
different models. A corporate goal should be the quantification of this model risk, 
creating an uncertainty of risk management itself. 


Psychology and Models 


Adding Psychology to Option Models — A First Step — Beilis/Dash/Wise 


Psychology drives much of the market for options. For example if the market 
sentiment is something like: "I'm buying puts right now because I’m scared of
the market", the price of puts will go up due to increased demand. Yet there are 
no explicit “psychology dials” in models for changing the amount of “fear”, 
“exuberance” etc. Volatility models have parameters and equations used to fit 
market option price data, but these models have no explicit psychology dials. 

Beilis, Dash, and Wise (BDW) took a step towards inserting psychology 
approximately into option pricing. Of course psychology is hugely complex; this
work considered one aspect of psychology in an idealized setting. 

There are three stages in the BDW model. In the first stage, a trader makes a 
decision to sell, and experiences regret if she feels ex-post that the ex-ante 
decision was suboptimal. Anticipated regret captures fear that traders have about 
making a suboptimal decision. The quantification is through a particular utility
function with a “regret” term in a behavioral finance framework. The regret term 
impacts the selling pattern of traders; those who anticipate regret sell more 
quickly than others. 

The second stage observes that the modified trading behavior with regret 
implies extra drops in the stock price. This is quantified using a model of 
Almgren and Chriss, applicable to stock options or FX options.

In the third stage, effects on options prices from these price drops are 
quantified. Changes in options prices can be expressed in several ways: (1) As a 
modified effective stock dividend for stock options, (2) As a modified effective 
foreign exchange rate for FX options, (3) As a modified implied volatility with an 
unusual additional term, or (4) As a suitable market price of risk. 

Some FX options data were examined; the additional implied parameters due 
to regret facilitated the numerical description. 

The main idea here was illustrative, breaking ground in a new direction by 
connecting behavioral finance with options modeling, providing a definite model, 
and showing that this model helps fit some options data. 


Psychological Attitudes towards Models 


The psychological attitudes toward models are not to be ignored. People who do 
not understand the limitations of models ask for the "best" model, and some 
people who should know better may believe they have the "best" model. Some 
people trust the models to such an extent that if their model disagrees with the 
market they assume the market is "wrong" and will eventually agree with the 
model. Sometimes this attitude pays off and sometimes it results in disaster. 
Sophisticated players understand and even fear the limitations of the models, 
sometimes using them only as a guide in difficult markets for illiquid products. 


Model Risk, Model Reserves, and Bid-Offer Spreads 


Models used for risk management are themselves risky to some extent. A 
corporate reserve could be taken to account for this model risk. This is difficult to 
convey to accountants, who want to know exactly how much the model risk is 
and exactly when or under which conditions they should apply the reserve. 
Because model risk will really show up when relatively illiquid positions are sold 
in difficult market conditions under pressure at someone else's model price, the 
risk is hard to quantify. Still, model risk is not zero and may be very large. 

Alternatively, if known, model risk can be used to estimate part of the
bid-offer spread for illiquid products.


Model Quality Assurance


The best way to quantify model risk is through a model "quality assurance" (QA) 
program (cf. Ch. 33). Model QA now exists in most large financial institutions,
although model risk is generally only determined in an incomplete fashion. 
Model assumptions and procedures are documented. Sensitivities to different 
parametric assumptions can be examined. Models can be assessed and compared. 


Models and Parameters 


Because there is no financial model that proceeds from first principles that are 
unambiguously correct, models are largely driven by parameters. These 
parameters are chosen through a combination of somewhat conflicting goals. The 
parameters are chosen such that the model into which the parameters are placed 
produces prices that, at least approximately, fit selected market data. The difficult 
problem is for cases where there is little or no market information. Models differ 
partially because the market constraints placed on the models can be chosen in 
different ways. The number of parameters is a compromise between fitting the 
known market prices and unwieldy complexity. 

Having chosen the parameters to fit the market approximately, the models 
can be viewed essentially as providing an extrapolation or interpolation 
methodology. Thus, if a deal comes up that has parameters not currently quoted 
in the market, which is often the case for over-the-counter deals, the model is
used to derive a price for that deal. The models, through extrapolation or 
interpolation, price illiquid instruments in a portfolio. 
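A minimal sketch of this interpolation role follows. It assumes a handful of quoted implied volatilities by strike (the numbers are hypothetical), linear interpolation between quotes, and flat extrapolation outside them. The choice of interpolation rule is itself a model assumption of the kind discussed here.

```python
from bisect import bisect_left

def interpolate_vol(quoted, strike):
    """Linear interpolation in strike; flat extrapolation outside the quoted range."""
    strikes = sorted(quoted)
    if strike <= strikes[0]:
        return quoted[strikes[0]]
    if strike >= strikes[-1]:
        return quoted[strikes[-1]]
    hi = bisect_left(strikes, strike)   # first quoted strike >= requested strike
    k_lo, k_hi = strikes[hi - 1], strikes[hi]
    weight = (strike - k_lo) / (k_hi - k_lo)
    return (1.0 - weight) * quoted[k_lo] + weight * quoted[k_hi]

# Hypothetical market quotes: strike -> implied volatility.
quoted_vols = {90.0: 0.24, 100.0: 0.20, 110.0: 0.19}

print(interpolate_vol(quoted_vols, 95.0))    # unquoted strike between two quotes
print(interpolate_vol(quoted_vols, 120.0))   # unquoted strike beyond the quotes
```

Two desks using the same quotes but different interpolation or extrapolation rules will price the same over-the-counter deal differently.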

It should be emphasized that the types and numbers of parameters in reality form
an integral part of a model. In a profound sense, the parameters cannot be 
separated or isolated from the assumption of the underlying dynamics and the 
implementation of the mathematics through some computer algorithm’. For
example, if one volatility is used to describe a diffusion process rather than
several volatilities, this is a model assumption. In fact, some models are just
shells into which complex parametric functions are inserted. The Black-Scholes
equity option formula is currently used in practice with a breathtaking richness of
the parameterization of the volatility "surfaces" describing different options
including “skew” effects’. The volatilities cannot be separated from the
assumption of simple diffusion and the algorithms used.


^ Model Quality Assurance or Model Validation? Because no financial model is
"valid" in a real scientific sense, we believe that the appellation "Model Validation" is
inaccurate and conveys a false sense of security. There can be different models giving
different estimates of the risk, which however are all “validated”. However, what is
important is that some form of the activity gets done. See the next chapter.


° Types vs. numerical values of parameters: The types of parameters form a part of the
model. The numerical values of the parameters, as distinct from their types, are dictated
by the external market conditions and do not form part of the model. The distinction
between the types of parameters and their numerical values is sometimes not well
understood and can have significant consequences. For example, the numerical values of
parameters chosen can be audited by a specialized "rate reasonability" group while the
computer algorithms can be checked by a model quality assurance group, but the types
and numbers of parameters chosen in the first place is a separate issue, which if not
examined can lead to a gap in the control process.


33. Model Quality Assurance (Tech. Index 4/10)


We have discussed models for various markets and purposes, and we have spent 
some time talking about model risk. In this chapter, we deal with procedures and 
activities designed to cope with some aspects of model risk¹, variously denoted as
“Model Quality Assurance", "Model Review”, or "Model Validation". We use 
Model Quality Assurance or Model QA for short. This is partly because the term 
"Quality Assurance" is used across the software industry'.

Whatever it is called, the idea is (1) to reduce model risk, (2) increase the 
understanding of models for Risk Management and possibly for the desks 
themselves, (3) reduce personnel risk, (4) produce documentation, etc. Although 
the beginnings were sometimes quite rocky², regulators and official bodies now
require such activities". Groups of quants performing Model QA now exist across 
the industry³.


Model Quality Assurance Goals, Activities, and Procedures 


A summary of Model QA goals, activities, and procedures follows. This is a 
representative list, but different policies exist. 


¹ History: My involvement with Model QA started around 1995 at Smith Barney, and
later in 1997-99 at Salomon Smith Barney, where I instituted and ran firm-wide Model 
QA for Fixed Income Derivatives, Mortgages, Equity Derivatives, and FX Derivatives. 


² Stories - The Trader’s Flying Putter and the Flying QA Model Doc: Part of these
early pioneering efforts involved selling the idea to indifferent or frankly hostile 
audiences on the trading side. A good war story involved a head trader, who exhibited his 
objections in a meeting to some aspects of Model QA by throwing a putter in my 
direction. It hit the wall and missed me, but not by much. Another trader was reported to 
have thrown the QA framework document outlined in this chapter around the room. The 
basic trader argument against model QA was that they were smarter than me, understood 
the models better than I did, and so why was I wasting their time. It wasn’t subtle. 


Starting up a Model QA program took courage. No regulators to run interference. 
People these days have no idea. 


³ Acknowledgements: I thank members of my Quantitative Analysis Group: A. Beilis, J.
Castresana, T. Gladd, and A. Lapidus, for their diligent work while this group was in
charge of performing firmwide Model QA at Salomon Smith Barney. I thank the
Salomon business-unit desk quants, systems people, and risk managers for their
cooperation in making the SSB Model QA effort a success. Finally, I thank E. Picoult and
L. P. Chan, who once led Model Validation at Citigroup, for discussions.


Desk-Level Model QA


Some (even extensive) model QA is done by desk quant model developers. 
Model testing in various situations takes place before the model is put on the 
desk. The traders then use the model in the real world, comparing prices to the 
market, noting whether the predicted hedges are or are not realistic, etc. Battle- 
hardened models are developed over years of use. Desk quants may have the 
attitude that, since they understand the models (perhaps better than anybody else) 
and have already tested them, model QA performed by an independent group is 
redundant. In a sense, the desk quants can perform the activities of a model QA 
group. They have a point. When I was developing desk models or supervising 
their development, I had a bit of the same attitude. However, independent Model 
QA—which is the real topic of this chapter—is indeed useful. 

If vendor models are used, desk quants will generally perform some model 
QA activities to test the vendor models. 

Pricing/risk system vendors now produce some model quality assurance 
(validation) documentation. 


Independent Model QA Group 


Model QA at a corporate level is often performed by an independent group of 
quants, usually PhDs in engineering, science or math with some finance 
background. This group is not under the control of the business units. The 
personnel have to have enough background to understand the models. Depending 
on the situation, they may reproduce the desk models or produce alternative 
independent models. Programming skills are essential. Systems support is 
desirable, although not always present at the desired level. 

A hybrid version is to use desk-level QA to do the work, with the 
independent quant group performing a control or supervisory role. 


Model Risk Testing Using Various Independent Models 


Our philosophy is that perhaps the most important issue is to quantify model risk 
through comparison of the results of different models. Then, when firm A (using 
its best model) has to sell its inventory to the street, and firm B buys this 
inventory (using its best model, probably different from A’s model), part of the 
bid-ask spread is the difference in the model valuations. 

In order to get a handle on this issue, the independent Model QA group can 
either construct or otherwise obtain models that are different from the desk 
models. One way of proceeding is to use relatively simple Street-standard “Ford” 
models as the independent models, to compare with the whiz-bang “Mercedes” 
proprietary models on the desk. This is actually reasonable, since if the desk is 
forced to sell its inventory (acting as firm A) then the buyer (firm B) may indeed
be using a Street-standard model and so the desk may be forced to sell its
inventory using its Mercedes model at the level given by the Ford model⁴.

It is a good idea to quantify the results from different models using “Diff 
reports". These are simply catalogs of results comparing the output from the 
models, day-by-day, for a representative set of deals⁵. The differences should be
noted for the hedges, not just the prices. 
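A Diff report of this kind can be sketched as follows. The two pricing functions are deliberately trivial stand-ins (hypothetical, not any real desk or Street model), and the deal fields `notional` and `rate` are assumptions made only so that the example runs. The point is the shape of the report: one row per deal, with differences in both price and hedge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A "model" here is any function mapping a deal description to (price, hedge).
Model = Callable[[Dict[str, float]], Tuple[float, float]]

@dataclass
class DiffRow:
    deal_id: str
    price_diff: float   # desk price minus independent price
    hedge_diff: float   # desk hedge minus independent hedge

def diff_report(deals: Dict[str, Dict[str, float]],
                desk: Model, independent: Model) -> List[DiffRow]:
    """Compare two models deal by deal, for prices and for hedges."""
    rows = []
    for deal_id, deal in deals.items():
        desk_price, desk_hedge = desk(deal)
        ind_price, ind_hedge = independent(deal)
        rows.append(DiffRow(deal_id, desk_price - ind_price, desk_hedge - ind_hedge))
    return rows

def toy_desk_model(deal):
    """Hypothetical "Mercedes" stand-in: values everything 2% higher."""
    return 1.02 * deal["notional"] * deal["rate"], 1.02 * deal["notional"]

def toy_street_model(deal):
    """Hypothetical "Ford" stand-in."""
    return deal["notional"] * deal["rate"], deal["notional"]

deals = {
    "D1": {"notional": 1_000_000.0, "rate": 0.05},
    "D2": {"notional": 250_000.0, "rate": 0.04},
}
for row in diff_report(deals, toy_desk_model, toy_street_model):
    print(f"{row.deal_id}: price diff {row.price_diff:,.2f}, hedge diff {row.hedge_diff:,.2f}")
```

Running the same report every day and storing the rows gives the day-by-day catalog described above.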


Test Suites 


Test suites of examples are useful to run periodically for control purposes. This 
activity will indicate if any model changes have occurred.
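As a minimal sketch, a test suite can store baseline outputs for fixed inputs and flag any case whose output has moved. The discount-factor functions and the day-count switch below are hypothetical, chosen only to mimic an unannounced model change.

```python
def discount_v1(rate, days):
    """Simple-interest discount factor with an Actual/360 day count."""
    return 1.0 / (1.0 + rate * days / 360.0)

def discount_v2(rate, days):
    """Same formula after a (hypothetical) unannounced switch to Actual/365."""
    return 1.0 / (1.0 + rate * days / 365.0)

# Fixed test inputs; the baseline is recorded once from the approved model.
TEST_CASES = {
    "3m": {"rate": 0.05, "days": 90},
    "1y": {"rate": 0.05, "days": 360},
}
baseline = {case_id: discount_v1(**inputs) for case_id, inputs in TEST_CASES.items()}

def run_test_suite(model, cases, baseline, tolerance=1e-10):
    """Return the ids of test cases whose output differs from the stored baseline."""
    changed = []
    for case_id, inputs in cases.items():
        if abs(model(**inputs) - baseline[case_id]) > tolerance:
            changed.append(case_id)
    return changed

print("unchanged model:", run_test_suite(discount_v1, TEST_CASES, baseline))
print("changed model:  ", run_test_suite(discount_v2, TEST_CASES, baseline))
```

An empty list means no detectable change. A non-empty list is the signal to ask the desk what happened.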


Documentation, Personnel Risk, and All That 


With some notable exceptions, typical desk quants are “too busy” to document 
models. From their point of view, since they understand exactly what they did, 
the value of documentation is low. From a corporate point of view, the value of 
documentation is very high. One reason is personnel risk. When the quant leaves, 
who understands the model?⁶

The documentation should include full disclosure of the models. This 
includes the theory, equations, parameters, numerical techniques, and practical 
utilization examples. Model limitations should be noted. Model inaccuracies, 
when known or found, should be listed. Software design criteria, including code 
comments⁷, should be noted.

The documentation should be made available to corporate risk management. 
This is so that the risk managers can better understand the models producing the 
risks with which they are dealing. 

Proprietary concerns that the desk might have are fully justified⁸. Therefore,
appropriate security measures should be taken. 


⁴ Question to the Professors: OK, so in this case, which model is Right?


⁵ Systems Help? If you're in a QA group and are lucky, maybe you can get some help
constructing such reports in an automated fashion from the friendly systems group.
Otherwise, you may be reading computer files in obscure formats, cutting/pasting, etc.


⁶ New Head Desk Quant: A variation occurs when a new head quant arrives.
Documentation of the existing models can be very useful in that case.


⁷ Code Comments: Of course, you always document your code with explicit, easy to
read, complete sentences in every code section, don't you?


⁸ Disclaimer for Proprietary Models: Although I obviously know a lot about Salomon's
proprietary models from having run firm-wide Model QA at Salomon Smith Barney, no
information is in this book regarding these models.


Documentation by the independent group by definition means that the
independent Model QA quants need to talk to the desk quants. This should be 
done with mutual respect. One way to establish a good rapport is to have regular 
time-limited discussions, where the agenda is understood in advance. The time 
allocated to the meeting should also be specified. The meetings should take place 
in a quiet office, off the trading floor. 


Independent Reproduction of the Same Model 


One way to understand exactly what is in the desk model is to try to reproduce 
the desk model in the independent Model QA group. The QA quants reproduce 
all equations independently. Reproducing (i.e. recoding) the model computer 
code is an important part of this philosophy. This is useful in that it can reveal 
model limitations or assumptions perhaps not otherwise evident (or understood). 
It also controls for software coding errors. Coding models is generally a very 
time-consuming task. One issue that arises is how model QA should be 
performed on the recoded models by the independent QA group. 


Judgment on Assumptions, Methodology, Algorithms 


The independent Model QA quants should have enough experience and 
sophistication so they can form independent judgment on relevance and 
reliability of various model assumptions, methodology, numerical algorithms etc. 


Which Models need Model QA? 


Generally, the models that produce hedges that affect corporate risk management 
should require independent Model QA, as well as the models that produce prices 
for the official books and records of the firm. 

Prototype models not yet in production, trading models used for decisions on 
the desk, quick calculators in spreadsheets, etc. may or may not fall under Model 
QA, depending on policy, resources, negotiations, etc. 


Model Testing Environment 


If possible, it is good to quantify model risk present for pricing and hedging 
through comparative testing in a “real world" context. Portfolios of representative 
securities should be used with realistic input parameters. Testing should occur 
over a period of time⁹.


⁹ Story: My quant group performing model QA once spotted an unannounced model
change on a desk through a big change in valuation differences that was reported one day.


Feedback to the Business Units


Maybe the independent quants will find out something that the desk didn’t know 
about its models. Feedback to the desks is therefore a positive idea. 


Model QA: Sample Documentation 


Here is a sample suggestion of what might be considered as ideal and extensive 
Model QA documentation. The objective here is to illustrate the issues raised in 
model development and systems implementation. This documentation is much 
more thorough than what is usually done. However, such a document could be 
relevant for complex models. Most models only need a fraction of this 
information. 

The format below is an outline to be filled out. Other formats naturally can 
accomplish the same ends. There are three sections: 


• User Section
To be filled in by the users of the model (trading, sales, etc.)


• Quantitative Section
To be filled out by the Quantitative Group designing the model, and (when
applicable) coding the model.


• Systems Section
To be filled out by systems personnel involved with model coding, integration,
maintenance, etc.


User Section of Model QA Documentation 


U1. Model Usage Specification


U1A. Trading and Sales

• List the trading applications of the model, if any.

• State the context in which the model is used.

U1B. Risk Management Reporting

• Specify if the model is used for risk management reporting,
especially if different from the model used for desk hedging.

U1C. Risk Management and Desk hedging

• Specify if the model is used for desk hedging. State which parameters
are different than for corporate risk management reporting, if any.


Quantitative Section of Model QA Documentation


Q1. Model: Fundamental Aspects


Q1A. Theoretical Description Summary
• Give a short (<100 word) abstract. A sensibly and reasonably
complete description, in good English and with formulas, should be
attached. Include a categorization of model variables, parameters,
stochastic assumptions, analytic model valuation, approximations.
Q1B. Domains of Applicability of Model
• Include products or features describable by the model in its present
version. Also, describe enhancement plans, if any.
Q1C. Limitations of Model
• No financial model has the status of a physical law in physics.
Describe the approximations and limitations of the model. Describe
products for which this model does not apply. Especially relevant are
products that may be appropriate for a future version.
Q1D. Hedging Aspects for Model
• Specify hedging aspects of the model, if used for risk management.
Q1E. Definitions of sensitivities from the model
• Give summary here. The attached model document should give the
definitions of model sensitivities to each risk parameter, precisely in
terms of measurable quantities or in terms of internal model
parameters with reference to appropriate equations.
• This should be supplemented with at least one clear numerical
example, with all terms defined.
Q1F. Comparative Model Analysis and Model Consistency across the Firm.
• When known, state how this model is consistent or inconsistent with
models used to price the same securities in other areas of the firm.
Q1G. Model Peer Review Description
Q1H. Model Noteworthy Aspects (e.g. discontinuous payouts, exotic features)
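As an illustration of the kind of numerical example requested under Q1E, here is a minimal sketch that defines delta, gamma, and vega by explicit bumps. The bump sizes (1% of spot, one volatility point) and the use of a Black-Scholes call as the stand-in model are assumptions; a real document would state the desk's own conventions.

```python
import math

def call_price(spot, strike, rate, vol, expiry):
    """Black-Scholes European call without dividends (stand-in model only)."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol * vol) * expiry) / (vol * math.sqrt(expiry))
    d2 = d1 - vol * math.sqrt(expiry)
    return spot * cdf(d1) - strike * math.exp(-rate * expiry) * cdf(d2)

def bump_sensitivities(price_fn, spot, vol, spot_bump=0.01, vol_bump=0.01, **fixed):
    """Central-difference sensitivities with every bump stated explicitly.

    delta: price change per unit change in spot
    gamma: delta change per unit change in spot
    vega_per_vol_point: price change per 0.01 change in volatility
    """
    h = spot_bump * spot  # absolute spot bump
    base = price_fn(spot=spot, vol=vol, **fixed)
    up = price_fn(spot=spot + h, vol=vol, **fixed)
    down = price_fn(spot=spot - h, vol=vol, **fixed)
    vol_up = price_fn(spot=spot, vol=vol + vol_bump, **fixed)
    vol_down = price_fn(spot=spot, vol=vol - vol_bump, **fixed)
    return {
        "delta": (up - down) / (2.0 * h),
        "gamma": (up - 2.0 * base + down) / (h * h),
        "vega_per_vol_point": (vol_up - vol_down) / (2.0 * vol_bump) * 0.01,
    }

# Hypothetical deal: at-the-money one-year call.
greeks = bump_sensitivities(call_price, spot=100.0, vol=0.20,
                            strike=100.0, rate=0.05, expiry=1.0)
for name, value in greeks.items():
    print(f"{name}: {value:.5f}")
```

Writing the bump conventions into the definition avoids the common confusion between, for example, vega per volatility point and vega per unit of volatility.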


Q2. Model Parameterization 


Q2A. Market Input Parameters (rates, prices, vols, dividends, correlations,...)
• For market parameters, specify the method used to extract the
parameter as well as uncertainties. Specify the source for the market
parameters, if known.
Q2B. Model Input Parameters (Strike, barrier level, expiration, etc.)
• Specify the model input parameters. Give the acceptable numerical
regions of the model parameters for the model.


Q3. Model Numerical Algorithms


Q3A. Analytic or Semi-analytic Numerical Methodology

• Describe in detail the relevant algorithms. Include references,
formulas and parameters.

Q3B. Non-Analytic Numerical Methods—Overall Description

• Specify the numerical method (e.g. Binomial, Monte Carlo, PDE,
Path Integral, ...). Also, give the parameter specifications relevant for
the numerical analysis.

Q3C. Convergence Criteria

• Convergence criteria should preferentially be expressed in terms of
pricing and when relevant, hedging with respect to appropriate
variables. Give the parameters describing the accuracy and
convergence. These can include the time-step discretization,
including parametric dependences (e.g. monthly or SA, specific dates,
interpolation, day count conventions).

• Specify any flexibility to obtain convergence (e.g. changing time step
amounts, grid enhancement). If Monte Carlo is used, give the number
of paths if specified in running the model. For binomial convergence,
specify odd vs. even number of time steps. Specify other parameters:
maximum rate, minimum time step, maximum # paths allowed, etc.
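The odd vs. even remark under Q3C can be checked with a minimal sketch. It assumes a Cox-Ross-Rubinstein binomial tree for a European call and compares it with the Black-Scholes closed form. The deal parameters are hypothetical.

```python
import math

def bs_call(spot, strike, rate, vol, expiry):
    """Black-Scholes European call without dividends, used as the reference value."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol * vol) * expiry) / (vol * math.sqrt(expiry))
    d2 = d1 - vol * math.sqrt(expiry)
    return spot * cdf(d1) - strike * math.exp(-rate * expiry) * cdf(d2)

def crr_call(spot, strike, rate, vol, expiry, steps):
    """European call on a Cox-Ross-Rubinstein tree, summed over terminal nodes."""
    dt = expiry / steps
    up = math.exp(vol * math.sqrt(dt))
    down = 1.0 / up
    prob_up = (math.exp(rate * dt) - down) / (up - down)
    total = 0.0
    for k in range(steps + 1):  # k = number of up moves
        payoff = max(spot * up**k * down**(steps - k) - strike, 0.0)
        if payoff > 0.0:
            weight = math.comb(steps, k) * prob_up**k * (1.0 - prob_up)**(steps - k)
            total += weight * payoff
    return math.exp(-rate * expiry) * total

params = dict(spot=100.0, strike=100.0, rate=0.05, vol=0.20, expiry=1.0)
reference = bs_call(**params)
for steps in (50, 51, 100, 101, 200, 201):
    error = crr_call(steps=steps, **params) - reference
    print(f"steps={steps:4d}  error vs closed form = {error:+.5f}")
```

Printing the error for neighboring odd and even step counts shows whether the convergence is smooth or oscillates, which is why the documentation should state the step-count convention used.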


Q4. Model Implementation ("Black Box")


Q4A. Requirements Documentation Description
• Describe the requirements documentation that you have assembled.
Q4B. Prototype Code

• Include reference for prototyping the model, if appropriate.
Prototypes that evolve into production code are no longer prototypes.

Q4C. General Specification (Compiled, Spreadsheet, etc.)
Q4D. Intent (for other code, for sales, for trading, etc.)
Q4E. Design, Architecture of Model Implementation

• Include modules or object description at a high level. Specify the data
structures at a high level. Specify the language for the model and
language for the wrapper for inclusion in the system. Specify any
other tools for model development used, if any.

Q4F. Model Coding

• Describe steps that were taken to ensure that “good practice coding”
has taken place to enhance maintenance and extensibility.

• Well-documented code means that what is going on is clear without
having to go through the code line by line. This includes code
readability by a person OTHER than the actual coder. Comments
(that are up to date and accurate) should exist in addition to self-
documenting variable names. A metric is the percentage of
comments/total lines in the code. Each module should at least have a
header explaining its function and results returned. Describe the code
documentation here.
• List any peer code review that has taken place.
Q4G. Spreadsheets for Models
• If model is used as an add-in, specify here. State how changes to
spreadsheet models are controlled (passwords, locked cells, etc.)
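The comment metric mentioned under Q4F can be computed with a few lines. This sketch makes two assumptions that a real policy would need to settle: blank lines are excluded from the total, and only whole-line comments starting with the marker `#` are counted.

```python
def comment_ratio(source, marker="#"):
    """Fraction of non-blank lines that are whole-line comments."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    comments = sum(1 for line in lines if line.startswith(marker))
    return comments / len(lines)

# A small hypothetical module to measure.
sample = """
# Price a zero-coupon bond.
def zero_price(rate, years):
    # Continuous compounding is assumed.
    return math.exp(-rate * years)
"""

print(f"comment ratio: {comment_ratio(sample):.0%}")
```

The ratio says nothing about whether the comments are accurate or up to date, so it supplements peer code review rather than replacing it.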


Q5. Model Changes or Enhancements


Q5A. Requirements Specifications
Q5B. Implementation of Model Changes
Q5C. Model-Change Documentation


Q6. Model Quality Assurance "Alpha" Unit Testing 


• “Alpha QA” testing means testing by the quantitative group
responsible for the model and by the personnel coding the model. 
Include independent pricing and hedging QA measures, checks with 
other systems, hand calculations, sanity checks, case checking, etc. 


Q7. Personnel Aspects 
• Specify the quantitative personnel responsible for the theoretical
specification of the model, parameters, hedging properties, etc.
• Specify the personnel responsible for prototyping the model, if any.
• Specify the personnel responsible for coding the black box model.
• List the model maintenance personnel.


Q8. Vendor Model Software


Q8A. Vendor Model Software
• Specify if the model is part of a vendor system (which system?)
Q8B. Vendor Model Quality Assurance / Evaluation
• Specify procedures for vendor model evaluation, acceptance criteria
for vendor models, and details for vendor model testing.


Q9. Model Library and Software Reuse 


• Specify any documentation for the library into which the model will
be put. Specify, when appropriate, library routines used in this model.


Systems Section of Model QA Documentation


S1. Model/System Integration 


S1A. Plan or Details of Integrating Model into System
• The model, if it is not a standalone pricer, will be incorporated into a
system. Name the system and the version.
S1B. Environmental Specification of System
• Operating System (including version), live feeds specification,
network, database including version, GUI builder, compilers, CASE
tools, version control software (e.g. SCCS), hardware compatibility,
any parallel processing capability, etc.
S1C. System Language
• The system language need not be the same as the model language,
since one language can call another.
S1D. System/Model Integration Specifications—Other Details
• Include model diagnostics from system when running the model.
Indicate diagnostics stored in the database or as files that can allow
for interpretation of the model results, quality assurance, etc.
S1E. “Good Practice Coding”
• Describe steps that were taken to ensure “good practice coding”.
S1F. Time Scale Estimation
• Describe how time estimates are established for model incorporation
into the system, and whether in the past these time estimates have
been accurate.
S1G. Maintenance
S1H. Changes to Model: System Integration Perspective
S1I. Contacts with Quantitative Personnel for System Implementation
• Describe the contacts with the quantitative modeling personnel in
order to ensure accuracy of the model as implemented in the system.
S1J. Model Distribution
• Include procedures for model distribution and version control,
according to geographical location.
S1K. System Upgrades and Model Re-integration, if appropriate
• Describe plans for re-incorporation of models into new or upgraded
system versions.
S1L. System Disaster Recovery Plan
S1M. Systems Integration for Spreadsheets
• If the model is implemented in a spreadsheet, or a spreadsheet add-in,
give the details of the systems integration, if any.


S2. Model Changes or Enhancements—Systems Aspects


S2A. Requirements—Systems Aspects 
S2B. Implementation—Systems Aspects 


S3. Model Quality Assurance Testing—Systems Aspects 
S3A. Independent Checks with other Systems 
S3B. System Integration Testing—Procedure 
• Include description of the test suite. Describe the consistency of the
system testing with the unit testing.
S3C. Regression Testing Procedure when changes are made to model/system 
S3D. Feedback Procedures, Forms for Testing 
S3E. Quality Assurance Documentation—Systems integration of model 
• Include legible English commentary, screen printouts, etc.


S4. Personnel Descriptions—Systems Aspects 
• Specify the personnel responsible for the general design, the
integration of the model into the system, and the coding.
• Specify personnel responsible for QA of system integration,
including testing if model or system is changed.
• Specify personnel responsible for maintenance of the model.
• Describe steps taken to avoid personnel risk.


S5. Model Library and Software Reuse 
• Describe library, if any, including resources.


34. Systems Risk Overview (Tech. Index 3/10)


This chapter contains a qualitative and non-technical overview of risk with 
Systems. Before starting, it should be emphasized that each organization tries to 
produce (or buys and customizes) the best systems it can, consistent with time 
pressures and resource constraints. There are very successful systems that work 
well, and are used every day. However, the development of systems can be 
problematic. This essay deals with some reasons for these problems¹.


Advice and a Message to Non-Technical Managers 


You are an expert in your field. Still, you may have to approve or disapprove 
expensive software projects and/or expensive computer hardware purchases, but 
you have little computer background. Sometimes high-level managers approve 
what turn out to be large, badly designed, and quite expensive homegrown 
systems that lead to all kinds of friction in the organization. You will feel more 
comfortable and make better decisions if you take the time to learn something 
about computer hardware and about software development so you have a gauge 
to use when confronted with such a decision. This chapter is a quick summary of 
some of the issues. 


What are the “Three-Fives Systems Criteria”? 


Many computer programs with entirely different scopes and capabilities are 
called "systems". One definition of a system uses what I call the "Three-Fives 
Systems Criteria". Namely, the system development uses up $50MM, employs 50 
people, and takes 5 years to become useful and functional. Some people disagree 
with these criteria, arguing that the numbers should be larger. Other people use 
“system” to represent a smaller effort, and so come up with smaller numbers. 


¹ History: I personally wrote a mini-system for derivative pricing and risk management.
Modules for swaps and options (including some exotics) for interest rates and other 
markets were included. Risk features included Greeks, forward ladders for 
delta/gamma/vega, and what-if scenarios that could be evaluated in future time. 
Attribution of risk to different risk factors was possible. There was a portfolio system. 
There were many screens for input/output. There were about 50,000 lines of code. I 
experienced many of the issues described in this chapter.


Calculators are not Systems

By a "calculator" is meant computer code that calculates the results of models
along with some risk measures. Calculators are much cheaper than systems.
Sometimes people mistake a calculator for a system. A system can have
calculators as modules, but generally a system will also have a graphical user
front end for input and output, a connection to a database, and network
communications, and will be employed by many users.


What is the Fundamental Theorem of Systems Risk? 


Systems are a Black Hole of Wall Street².


What are Some Systems Traps and Risks? 


There are many success stories of software development. Systems development 
on Wall Street struggles against many pitfalls of systems development that are in 
fact chronic to the entire software development industry. There are four 
conflicting goals: 


Fast development 
Cheap development 
Complete development 
Reliable development 


Refinements include over-optimism, underbudgeting³, redefinitions of goals
in midstream by the end-users or by the systems group, misunderstanding, 
miscommunication, upper management incomprehension, egoism, programming 
errors, absence of software quality assurance, inappropriate design architecture, 
disparate hardware, incompatible databases, programmers getting reassigned, etc. 

Unclear language is rampant: "All right, we really need to get it done by next 
month". The phrase "get it done", which uses the dangerous pronoun "it", is 
always to some extent undefined and can change meaning. 


² Exceptions: Of course, maybe your system is different.


³ The Fudge Factor for Systems Budgeting: This applies to proposed projects that look
big but are underestimated in cost, manpower, and time. This chronic problem can be 
partially avoided by taking the best rational estimates and multiplying them by a Fudge 
Factor FF to give the Real Answer. Based on historical experience FF = 3 is a reasonable 
estimate for each of these variables. What do you think? 


An often overlooked problem is personnel risk, where only a few people
actually know how individual parts of the system are built, and they are "too
busy" to document anything. This particular problem manifests itself when these
people leave, at which time part of the systems organization may go into some 
version of crisis mode, mixed with temporary stagnation in system development. 


Specific Communication Risk 


A generic risk is the lack of effective communication between the computer 
personnel and the end-users, including quants. This is made worse by the fact that 
these different groups of people—by their training—speak different languages 
and only roughly glimpse the problems faced by the others. A good tactic is to 
train the quants in systems issues and to train the systems programmers in 
business and quantitative issues so that they start to speak the same language. In 
reality, the most valuable personnel do have the ability to speak technical and 
business languages fluently. However, this takes time and is often only partially 
successful. 

For PhD quants involved with models, the most common situation is that the 
quant needs to understand the mathematics, the numerical algorithms, “enough” 
of the finance, and the parameters; he/she then actually does the programming for 
the model. This model is then inserted as a black box into a system programmed 
by system programmers. This paradigm presents its own set of communication 
difficulties between the quants and the system programmers. These difficulties 
are sometimes nefarious, and must be worked out patiently to be successful. 
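
One way to reduce friction at the boundary between model and system is to agree on a narrow, explicit interface before either side writes code. The sketch below is only an illustration of that idea. The names `PricingModel`, `ZeroCouponModel`, and `run_system` are hypothetical and are not taken from any particular system.

```python
from typing import Dict, List, Protocol


class PricingModel(Protocol):
    """Contract that the system programmers code against (hypothetical)."""

    def price(self, inputs: Dict[str, float]) -> float:
        ...


class ZeroCouponModel:
    """A quant's model. The system never looks inside it."""

    REQUIRED = ("face", "rate", "years")

    def price(self, inputs: Dict[str, float]) -> float:
        missing = [name for name in self.REQUIRED if name not in inputs]
        if missing:
            # Fail loudly and name the parameters, so the problem is
            # visible to both the quant and the system programmer.
            raise ValueError(f"missing inputs: {missing}")
        return inputs["face"] / (1.0 + inputs["rate"]) ** inputs["years"]


def run_system(model: PricingModel, trades: List[Dict[str, float]]) -> List[float]:
    """The system side: it knows the interface, not the model internals."""
    return [model.price(trade) for trade in trades]


prices = run_system(
    ZeroCouponModel(),
    [
        {"face": 100.0, "rate": 0.05, "years": 1.0},
        {"face": 100.0, "rate": 0.05, "years": 2.0},
    ],
)
```

The point of the design is that the list of required inputs and the failure behavior are written down in one place that both groups can read.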

One potential problem is programmer turnover with the consequence of 
inefficiencies due to training and ramp-up time⁴. 

Another issue is communication among the various systems people 
themselves, including their hierarchy. Possibilities of miscommunication grow 
exponentially with the number of people involved. An entire layer of mid-level 
systems management is needed to cope with this problem and to interact with the 
end users. The efficiency of system development using this paradigm depends 
critically on the systems expertise of the manager along with his/her knowledge 
of the business. Sometimes this layer helps, but unfortunately, sometimes it just 
gets in the way of effective communication between the end user and the 
programmer. 

For an enterprise risk system, communication becomes a key issue. 
Individual silo systems that already exist are typically based on different products 
and/or different functionalities. One system may not “talk” to another system, 
and the people in charge of these systems may not engage in much 
communication between themselves either - they are up to their ears in work, 
after all. On the other hand, an enterprise system using the functionality of 
departmental systems depends on enterprise-wide communication. The 
alternative is that the enterprise risk system has to duplicate appropriate 
functionality of departmental systems. 


4 System Personnel Risk: Here is one trap. The systems management might have the 
naive attitude that programmers are functionally identical, and therefore programmers can 
be put into a pool and switched around between projects without degrading anything. 


The Birth and Development of a System 


It is instructive to see the different perspectives regarding how systems begin and 
develop. Systems developers often (and justifiably from their perspective) want 
"specs" written by the end users that completely and accurately describe the 
requirements for the system soup to nuts. Once the systems group establishes the 
specs, they establish "milestones" along a well-defined path to the final delivered 
"solution" product. The whole thing is monitored using software⁵. 

The end users generally have very little time for these "extra" activities. First, 
they usually have no idea what the computer people mean by the word "spec", 
and only reluctantly participate. In fact, end users often cannot describe the 
requirements accurately. This does not necessarily imply a deficiency or 
inattention on the part of the end user, because planning a system is extremely 
complex. Compounding the problem, different users, who appear at different 
times, can have different requirements. Further, the requirements of a given user 
(justifiably from his/her point of view) can change with time. Requirements may 
change for good reasons connected with optimizing the business, or because of 
an enhancement that for some reason only becomes envisioned after the 
development is underway. 

Often, thinking through a complete system is impossible because the human 
brain simply cannot work through the logic. That means that writing a definitive 
once-and-for-all spec is sometimes literally impossible. 

Development difficulties are ameliorated if the system is written in a modular 
or object-oriented fashion, but even in this case there can be real planning and 
execution problems for large systems. 

Once a system gets to a certain stage, the system is often difficult to change. 
Systems programmers are usually reluctant to accept the changes because they 
compromise the system milestones, and they get annoyed at the end users for not 
putting in these items at the beginning. 

The problems are further exacerbated because end users rarely have any 
accurate concept of the difficulties of the programming, and do not understand 
"Why It Is Taking So Long"⁶. 


5 Gantt and Pert Charts: These utilities can be useful for tracking systems progress, but 
they may be difficult to formulate and update, and they can even become a distraction. 
My reaction to a systems manager who shows up with one is usually “Whoopee”. 


6 Programming Difficulties: You will get a much better sense of the problems if you try 
programming something yourself. Come on, it's not Beneath your Dignity. 


Programming Difficulties 


Programmers have to cope with a myriad of extremely difficult issues. 
Programming is both an art and a science. Extreme concentration, exceptional 
discipline, difficult training, and high mental ability are prerequisites for a good 
programmer. Moreover, programming generally requires communication and 
coordination with others working on the same system. The best programmers are 
worth their weight in gold. 

Programmers, on the other hand, sometimes do not have enough financial 
background to work as efficiently as they might, or to know when a result 
coming out of the program makes sense or is manifest garbage. 

A common programming problem exists in modifying or interacting with 
large systems that others have written, and which the current programmers only 
understand incompletely. This can be especially true for programmers writing 
custom code to insert into a large system, raising consistency issues. 


Prototyping 

In practice, prototyping is often used. A quick prototype is constructed that is 
intended to be a "trial balloon" that then will be reprogrammed once the end users 
sign off. The end-user starts to use the prototype. By definition, a prototype does 
not have full functionality and the end-user wants to see more, quickly, without 
waiting for the rewrite. An initially unintended consequence is that the prototype 
steadily grows larger, and because of technical problems due to shortcuts, 
becomes incapable of the flexibility required to meet growing business needs. 
Many prototypes become ensnared in this global attractor to become the Final 
System, warts and all. 

A common problem is that the prototype, a small endeavor, starts to grow and 
eat up resources and money on the long rocky road to becoming a True System. 
Often, management does not recognize the scope or uncertainty of the required 
resources, time, and money. 


Who Controls the Systems? 


Because a system is a complex product, it is often constructed in an independent 
Information Technology (IT) department, Systems Technology Organization (STO), or 
“R&D” group. The IT department can implement a rigorous systems environment that 
is desirable from many points of view, including regulatory aspects. 

Alternatively, a decision by the business units to control the system process 
locally in the departments can lead to better flexibility and communication, but 
this can also lead to shortsighted technical decisions for a variety of reasons. 

Not the least problem is related to the boundless ego of some traders who 
believe they can infallibly direct the construction of the World's Best System. 

Structurally, isolated departmental efforts can become problematic. 
Centralized corporate requirements (for example centralized risk management 
reporting) can become a gigantic exercise in coordinating information from 
iconoclastic isolated departmental systems. The departmental business units, 
while co-operating, can be less than enthusiastic about this expensive activity. 

A successful enterprise-level system needs explicit support from management 
at the top, communicated down to department levels. 


Systems Risk in Mergers and Startups 


In recent years, with mergers becoming a staple of life, retrofitting old systems 
or building communication layers between existing systems from the different 
companies in the merger has become a large and unwieldy enterprise, with a 
variety of difficulties. The temptation to "scratch it all and build it over right this 
time" is seductive. Sometimes this is a viable approach, but it can lead back to the 
Black Hole Systems Theorem. 

Startups present an entirely different picture, since there is a blank slate, 
which does not constrain development. Unfortunately, if management that 
controls startups is inexperienced regarding systems development, potential 
systems pitfalls can become reality. 


Vendor Systems Risk 


One alternative, often adopted, is to use vendor systems in which development 
costs are divided among the vendor's customers. Vendor products have real 
advantages in some cases, especially with startups. A vendor system is "turnkey", 
so that long, expensive initial in-house system development is avoided. Some 
vendor systems for derivatives, for example, have hundreds of man-years of 
development⁷. While the cost for a vendor system can seem large, it can be much 
more economical than startup in-house development from scratch. Vendors 
develop their systems to appeal to market participants. Some vendors have clients 
who are broker dealers and banks and they get valuable feedback from these 
users. Sometimes alliances are formed for mutual development projects. 

Some problems with vendor systems can surface. For example, once the 
honeymoon period is over after the contract is signed, the service can drop in 
quality. Communication difficulties with the vendor bureaucracy can lead to the 
vendor product not being delivered completely in line with what the client wants. 

Still, all in, a vendor solution is one that is often successful in fulfilling the 
needs of the client in a reliable and perfectly reasonable manner. 


7 Man-Years and Risk: For example, 100 man-years can mean 20 programmers working 
for 5 years. The major expense and long time, along with many difficulties that need to be 
overcome, are the risks that the vendor assumes. The clients share the development costs 
and are largely insulated from the risks. 


Evaluating Vendor Systems 


In order to evaluate vendor systems, a concerted effort involving quants, systems, 
trading, back office, etc. must be undertaken. A good deal of time and effort is 
required to examine and compare thoroughly the systems of the various vendors. 
An in-house vendor review document should be systematically constructed for 
evaluation. 


In-House Developers and Vendor Systems 


In-house developers usually oppose vendor systems. There is a logical basis for 
this attitude, since the in-house system code is more transparent, and the in-house 
system will generally contain the customization desired with good feedback. 

Vendor systems can be simple or problematic for customization. Most 
systems allow proprietary models to be inserted, and customized risk reports can 
be generated. However, dealing with interfaces to systems, which are essentially 
black boxes to the customer, in some cases can be difficult and clumsy in 
practice. Moreover, models written by the local quants that work fine in 
standalone mode can be negatively impacted inside a system. Bug fixes to 
customized code can be hard to incorporate. In other cases, the interfaces are well 
designed, and adding models is quite simple. It is a good idea to test-drive the 
capability of the system before buying, in order to see just how simple or difficult 
it is in practice to add a model to the system. 


Do you REALLY want the Source Code? Maybe Think Again. 


The possibility of buying the vendor source code is sometimes seductive to 
ambitious in-house programmers. The advantage of owning the source code is 
that the in-house programmers have complete control and transparency, and can 
modify the source code for customization. 

Source code however is a double-edged sword. Upgrades of the vendor 
software can require a lengthy and painful porting exercise for the customized 
code. Moreover, the in-house programmers need to spend substantial time 
initially understanding the vendor system, which can be huge (the scale can be a 
million lines of code or more). Compatibility problems between the in-house 
code and the vendor code can surface. 

A red herring that has been used to justify source code purchase is that the 
human mind can only process a few things simultaneously. The argument is that 
therefore hooking up many models to a vendor system without source code is 
problematic. However, models can be hooked up independently. Moreover, the 
argument can backfire since the complexity of the source code can be a trap. 


New Paradigms in Systems and Parallel Processing 


New paradigms in systems are important. While it is important to be on the 
“leading edge”, it is desirable to avoid the “bleeding edge”. The most significant 
potential development is the Internet. Except for transmitting and displaying 
results, e-mail etc., quantitative development today mostly remains in more 
traditional areas. The biggest potentially useful paradigm in systems is parallel 
processing. The perennial increase in hardware power has meant that simple 
networked sets of workstations can be used in many cases for coordinated 
computing. Parallel processing is sometimes used, but could be exploited more 
than it is now. We discuss parallel processing in the next chapter. 


Languages for Models: Fortran 90, C++, C, Python, Others 


Computer languages are unfortunately often viewed emotionally a bit like 
religion⁸. The programmer skill and the style of programming are, however, 
paramount. Good or bad code can be (and has been) written in any language. 
Coding quality and style can be clean or cryptic. 

Model code is a separate issue from system development. Models tend to be 
structured as technically difficult mathematically intense modules written by the 
quants, which connect to a larger system for input and output. 


Do We Really Need to Write in C++ for Model Code? 


Systems now tend to be written in C++. Models may be, but do not have to be, 
written in C++. Because model coding is extremely difficult, it is most efficient 
for a quant to write model code in his/her most comfortable language (C, C++, or 
Fortran), instead of being forced into using C++ (which may be an unfamiliar 
language). Writing in the language of highest proficiency for the quant saves time 
and reduces bugs. 

Note that there is no “Tower of Babel” since models written in one language 
can be readily interfaced with systems written in another language. 


Fortran (90) 


Fortran was, and maybe still is, the most common language used for 
science/engineering codes, and used to be the first language of many quants. 
High-end supercomputing is often done with Fortran, which, partially due to its 
superior compiler technology, remains the fastest and perhaps the most 
convenient language for numerical computing¹⁰. 


8 Acknowledgement: I thank Gregg Rapaport, a super back-office guy, for this quote. 


9 Viewpoint: This section is written from a contrarian, perhaps heretical, viewpoint. 

Fortran is badly misunderstood, even by systems gurus. Many do not know 
that Fortran was transformed into a modern language (Fortran 90), and is being 
further improved¹¹. Fortran 90 contains many features of C, C++, and APL¹². 

We will meet Fortran again in Ch. 53 on Climate Change Risk Management. 
Large climate model codes are often written in Fortran. 


C 


C used to be routinely used for numerical model code. In addition, entire systems 
were sometimes written in C. However, extra care has to be taken since the C 
language can be unpleasantly problematic, as I found out from experience¹³. 


10 A Cool Fortran vs. C Speed Story: The president of a large C++ derivatives vendor 
software company was once in my office where I gave him a demo of my derivatives 
model prototype system. He was impressed by the speed, and asked in what language the 
software was written. When he heard “Fortran”, he said “Wow, that's why it’s so fast”. 


11 Fortran 90 Is Modernized, Containing Most Features of C, C++, and APL: Today, 
Fortran 90 has object-oriented features including modules, encapsulated data and 
procedures, private and public attributes, general structures, and operator overloading. 
Many APL-like parallel array or vector functions are provided that are convenient and 
useful for scientific programming. This means a one-line statement (e.g. A = 0.) can 
apply to an entire matrix. 

Procedures (subroutines, functions) now have call by value in addition to call by 
reference, optional arguments, recursive procedures, and arbitrary argument order. 
Pointers exist and have an explicit target attribute for efficient and controlled usage, 
dynamic memory allocation, and pointer operations. Character/string enhancements 
include new operators. Execution constructs include case, cycle, and do while. 

Other improvements include enhanced read/write I/O features, free-format code, 
31-character names, bit manipulation, explicit obligatory typing, new random number 
generator features, and arbitrary array indexing. Compatibility with Fortran 77 exists, so 
code migration can take place deliberately, compatible with numerical libraries. Fortran 
can be interfaced with Excel. There are compatible screen-drawing programs. The 
Fortran language remains easy to read. Check it out. 


12 Fortran Used Less Now: Since the 1st edition, Fortran is rarely used. My opinion 
remains unchanged. 


13 Bad News C-Code Bugs, Cryptic C Code, Heisenbugs, and a C Story: Destructive, 
time-wasting episodes with bugs in C-code written by skilled quants and programmers in 
my quant groups occurred over the years. These included compiler incompatibilities, 
memory leaks, and pointer problems. Cryptic code can be problematic with C - just try 
reading someone else's C code (and C++ is no better). The “Heisenbug”, brought to my 
attention by Jon Hill, is a mysterious C bug that shows up only when the program is 
unfortunately “disturbed” by the computer environment. Here’s the C story: The 
programmers at one derivatives shop refused to use C at all because they considered the 
language “too dangerous”. While that is perhaps an extreme viewpoint, I heard it with my 
own ears. 


Python 


Python recently has made inroads for prototyping calculations. Readability is 
one of its primary attributes. 


Other Prototyping 


Commercial packages (mostly Matlab and also Mathematica) are used profitably 
for prototype modeling. 

Java is useful for simple models and Internet apps, but the language has not 
made inroads into industrial-strength model code. Visual Basic is popular, and is 
enthusiastically promoted and used for model prototypes by some. 


The Most Common Prototype Platform 

The most common prototyping method remains spreadsheets, usually Excel¹⁴. 
Spreadsheets can function as front ends, with add-in functions to perform the 
actual model calculations. 


What's the "Systems Solution"? 


Systems-speak uses the word "solution" to characterize a system. Actually, there 
is no really good solution. Human beings are analog animals. No computer or 
robot can reproduce the capabilities of human vision or touch. On the other hand, 
human evolution has not required the precise logical thinking of the type required 
for digital computer programming. Programming often consists of a giant logical 
exercise with severe consistency and interdependence difficulties that are only 
partially resolved by modular or object-oriented design¹⁵. 


Are Software Development Risks Unique to Wall Street? 


No. No. No. 

It may be (but probably is not) a consolation to understand that the problems 
of system development faced by Wall Street are endemic across the entire 
software industry, where system projects are often incomplete, over budget, 
buggy, and late. Computer technical journals are replete with articles on the 
subject. Perhaps the reader has had some direct experience with the problem. 


14 Spreadsheet Story: I have written hundreds of Excel spreadsheets (color coded - cf. 
previous chapter). You can tell how long someone has been around by asking if they ever 
programmed a Lotus spreadsheet. I once did, for FRAs and bond future options. 


15 Consistency Issues in Programming: You will understand the seriousness of the 
consistency problems if you have tried programming a prototype system, or have 
witnessed expert programmers grappling with an industrial-strength system. 

Books have been written about "best practices" in programming, and there 
are metrics (for example the Capability Maturity Model), which have been 
devised to measure systems quality. In practice, such good advice is followed 
only to some extent, mainly because “there is no time” and “it is too hard”. 


35. Strategic Computing (Tech. Index 3/10) 


This chapter is concerned with an overview of strategic directions for numerical 
financial computing¹. Parallel processing will be emphasized. The need is for 
rapid and cost-effective valuation of large portfolios of options, mortgage-backed 
securities, bonds, etc. in order to enhance competitiveness in trading, risk 
management, and sales. A description is given of the utility of parallel-processing 
computers, distributed workstation environments, and technological advances 
pertaining to new directions in financial numerical computing². 

The material in this chapter is self-contained, and can be read separately from 
any other material in this book. 


1 History and Stories: This chapter is based on my 1989 preprint (ref i); the principles 
therein are still valid. The history started in 1987-89 at Merrill Lynch. At that time, as 
anyone over a certain age will remember, mainframes dominated, Unix workstations 
were just beginning, and PCs were not yet competitive. Following the lead of Prof. 
Harold Shapiro, who was visiting from the Courant Institute and had set up a SUN 
workstation, I started a project called "cost-effective computing". The idea was that 
quants could contribute to the firm's bottom line through innovative thinking about 
computing. The process included a systematic evaluation of vendor workstations, holding 
meetings, giving presentations, and trying to convince the management that workstations 
would be an effective paradigm. This led to the first workstation network at the firm for 
quantitative work, one of the first on the Street. The whole process was rather 
exhilarating. 


I then started a serious investigation of parallel-processing machines. At the time, 
Intel had an initiative for high-end parallel financial computing. A parallel machine was 
eventually adopted at the firm for CMO mortgage calculations. 


The experience was at times dangerous. One war story will suffice. A meeting in 
1988 was commanded by the Heads of the firm's Systems Group (read Mainframes), all 
sitting at one end of a huge polished wood table. These clearly annoyed officials wanted 
to know why workstations were not a waste of money. My manager up two layers 
deflected the attack in a masterful fashion by characterizing workstations as "just big 
calculators". The dinosaurs slipped on this banana peel and the project continued. 

Much later, I was having coffee in the World Financial Center. A manager in the 
firm's systems group (not one of the above) came over and said “You changed the way 
we thought about computing". I was proud of that. 


2 Acknowledgements: I thank Jon Hill for many informative discussions, for carrying 
out many calculations in my groups, and for some helpful comments added to Ch. 34. I 
also thank Mike Driscoll for his enthusiastic help at Merrill with the workstation project. 
Finally, I thank all the systems programmers and managers over the years who contributed 
to the success of the quantitative efforts with which I was involved. 


Introduction and Background 


Numerical financial computing includes a number of important issues. Banks, 
broker-dealers, insurance companies etc. must deal with calculating the worth of 
large portfolios of financial instruments. These include mortgage-backed 
securities, corporate bonds, options, complex structured products, etc. Often the 
securities contained in these portfolios are not traded on the open market. For 
risk-management and portfolio total-return scenario analysis in the presence of 
underlying variable (e.g. interest-rate) movements, these securities need to be 
re-evaluated many times using models. Trading and sales activities also require 
securities pricing using models, preferably in real time. 

Financial securities often contain embedded options that the holder or issuer 
of the security may exercise, and the options models are often sophisticated and 
numerically intensive. Typically, this involves diffusion with complex boundary 
conditions. The valuation is carried out using analytic approximations, 
Monte-Carlo simulations, PDE algorithms, etc. 

For optimal financial strategies, these complex models need to run fast for 
large numbers of securities and, for a robust evaluation, under a selection of 
possible economic or risk stress scenarios. Moreover, all this must be done at a 
reasonable cost. 

An extremely promising direction for some time has been parallel 
processing³. Many numerically-intensive financial applications parallelize easily. 
These include portfolio calculations involving models. More generally, 
applications including many repetitive calculations are natural candidates for 
parallel processing. The most natural application is Monte Carlo simulation. 
There are other uses for parallel architecture, including large database 
applications. Feature recognition and comparisons of different time series with 
each other for arbitrage trading purposes can parallelize, including real-time feed 
applications. Optimization applications are more complicated. 
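
As a minimal sketch of why Monte Carlo simulation parallelizes so naturally: the paths can be split into batches that share nothing except the model parameters, each batch can run on a different node, and the host only has to average the batch results. The example prices a European call under lognormal dynamics. The function names, the parameter values, and the one-seed-per-batch scheme are illustrative assumptions. The batches run one after another here, whereas in practice each call to `batch_mean_payoff` would be sent to a separate node.

```python
import math
import random


def batch_mean_payoff(seed, n_paths, s0, k, r, sigma, t):
    """One node's work: mean discounted call payoff over n_paths paths."""
    rng = random.Random(seed)      # separate random stream for this batch
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_t - k, 0.0)
    return math.exp(-r * t) * total / n_paths


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form value, used here only as a sanity check."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * normal_cdf(d1) - k * math.exp(-r * t) * normal_cdf(d2)


params = dict(s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0)
n_batches, paths_per_batch = 4, 20_000

# Each batch depends only on its own seed, so neither the order of
# execution nor the node that runs it changes the batch result.
batch_means = [batch_mean_payoff(seed, paths_per_batch, **params)
               for seed in range(n_batches)]
mc_price = sum(batch_means) / n_batches    # equal batch sizes: plain average
bs_price = black_scholes_call(**params)
```

With equal batch sizes the host's combination step is a plain average. With unequal batches it would be a path-weighted average.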


Illustration of Parallel Processing for Finance 


To illustrate parallel financial computing, here is an example. Imagine that an 
analyst or risk manager wants a portfolio of securities to be run each day with a 
number of different yield curves as scenarios for risk analysis. A parallel 
platform can be constructed to do these calculations. The basic idea is simply to 
choose a platform with several “nodes”, each node capable of performing a 
calculation of a security. The same architecture can be used for a Monte-Carlo 
simulation of portfolio risk, mortgage models, etc. Each node is independently 
capable of performing calculations for one security. The analyst interacts with the 
nodes through a host machine. The host machine sends information to the nodes, 
receives results for display, produces reports, etc. 


3 Parallel Processing Now Commonplace: Since the 1st edition, when the above was 
written, parallel processing has become commonplace, mostly using networks of 
computers. 


The picture below gives the idea: 


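[Figure: the ANALYST works through the host machine, which sends calculations to the parallel nodes and collects the results.] 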
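
The host/node arrangement can also be sketched in a few lines of code. In this illustration a thread pool stands in for the nodes and the main program plays the host. The bond data, the flat yield-curve scenarios, and the names `price_bond` and `run_scenarios` are hypothetical. For genuinely compute-bound models the workers would be separate processes or machines rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def price_bond(task):
    """Node-side work: value one annual-coupon bond under one flat yield."""
    bond_id, scenario, coupon, years, y = task
    cash_flows = [coupon] * (years - 1) + [100.0 + coupon]
    pv = sum(cf / (1.0 + y) ** (i + 1) for i, cf in enumerate(cash_flows))
    return bond_id, scenario, pv


def run_scenarios(portfolio, scenarios, n_nodes=4):
    """Host-side work: send tasks to the nodes and assemble a report."""
    tasks = [(bond_id, name, coupon, years, y)
             for bond_id, (coupon, years) in portfolio.items()
             for name, y in scenarios.items()]
    with ThreadPoolExecutor(max_workers=n_nodes) as nodes:
        results = list(nodes.map(price_bond, tasks))
    report = {}
    for bond_id, scenario, pv in results:
        report.setdefault(scenario, {})[bond_id] = pv
    return report


portfolio = {"BOND_A": (5.0, 3), "BOND_B": (4.0, 10)}   # (coupon, maturity in years)
scenarios = {"base": 0.05, "up_100bp": 0.06, "down_100bp": 0.04}
report = run_scenarios(portfolio, scenarios)
totals = {name: sum(values.values()) for name, values in report.items()}
```

Each task carries everything a node needs (one security and one scenario), so the nodes never have to talk to each other. Only the host sees the whole portfolio.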


Some Aspects of Parallel Processing 


Parallel Processing Computers 

The explosion in the technology of microprocessor chips is the driving force 
behind the viability of distributed and parallel computing as feasible solutions to 
increasing computing requirements. Advances in solid-state physics and 
engineering, and in chip design, have contributed to this development. This 
revolution is by no means over, and developments will dramatically further 
increase the power of microprocessors. Since the chips are cheap, the 
price/performance of such systems is extremely favorable. Further advances in 
chip technology and networking imply that the situation will continue to 
improve. Decreasing the transistor size on the chips is challenging, involving 
x-ray lithography and advanced heat-sink technology. 

A parallel distributed computing environment composed of advanced chips 
and a network can be packaged as a separate board, set of boards, or parallel 
processing computer. The networking can be made efficient for such parallel 
machines. 

Supercomputers now invariably have parallel architectures⁴. 


Distributed Parallel Workstation Networks 


A workstation network can itself be used to perform parallel computing because 
one workstation can call on other available workstations in the network to 
perform simultaneous computations. Issues to be addressed in comparing such a 
style of parallel computation include the availability (or unavailability) of enough 
workstations at appropriate times to perform the required calculations, the 
capacity of the network connecting the workstations together, and the relative 
cost/performance of adding another workstation to the network. Workstation 
networks are naturally cheap, because they use existing machines⁵. A new twist 
on the idea is to use many machines run over the Internet. 


Nodes 


Central to the discussion of parallel processing is the concept of a "node". A 
medium-sized node is comparable to a workstation, complete with local memory, 
and possibly I/O [Input/Output] capability, but not necessarily a monitor. N of 
these nodes are then coupled together in a network. The number (e.g. N = 2, 4, 
..., 128, ...) depends on the type of node and the overall cost. Especially 
important is the network-management software. The network is then attached to 
the host. 


4 State of the Art Parallel Supercomputer: The biggest supercomputers have been 
parallel machines for a long time. Records are constantly broken with tens of thousands 
of processors and (in 2013) multi-petaflop speeds. Here FLOP means FLoating-point 
OPeration, and a petaflop is 10^15 floating-point operations per second. Typically the 
LINPACK mathematical software benchmark is used. 


5 Drawback: A drawback in practice can be that runs using many machines have 
interruptions due to system administrator actions that are not coordinated with the run 
schedule. Also, the time delay for calculations in an ordinary network is greater than for a 
customized back-plane network that couples processors together in a parallel machine. 


The nodes can be smaller or larger. A few mainframes can be networked 
together in a unified system; here the (huge) node is a mainframe. At the other 
end is a computer with thousands of very small nodes. The size of the node is 
referred to as the "granularity" of the machine. The choice of the parallel 
hardware, and in particular the granularity, depends critically on the nature of the 
applications. The basic points are that one node must be able to handle an 
independent calculation of interest, and that node power should be used 
optimally. This means that a large and powerful node should not be used to 
perform many repetitive small calculations, and a small node should not be called 
upon to perform large calculations. For many financial applications involving 
models, a medium-grained node machine is probably optimal. 


Trivial or Linear Parallelization 


The parallel processing architecture is most appropriate for those applications 
where individual calculations are independent of each other (such applications 
are said to "linearly" or "trivially" parallelize). Thus, all nodes can carry out these 
individual calculations at the same time. The results are transmitted to the host. In 
this linear-parallelizing case, the conversion of software to take advantage of 
parallel processing can be simple. Parallel processing can be applied to 
applications requiring inter-node communication, but the coding is naturally 
more complicated. A large class of financial applications linearly parallelizes. 
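
As a minimal sketch of a linearly parallelizing calculation, the example below splits a Monte Carlo valuation of a European call into independent chunks, one per "node", and lets the "host" average the results. Everything here is an illustrative assumption and not taken from the text: the function names `price_chunk` and `parallel_price`, the lognormal model, and the parameter values. Threads stand in for nodes only to keep the sketch self-contained.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def price_chunk(seed, n_paths, s0=100.0, strike=100.0, rate=0.03, vol=0.2, t=1.0):
    """Sum of discounted call payoffs over n_paths independent paths (one node's work)."""
    rng = random.Random(seed)  # each node uses its own independent random stream
    drift = (rate - 0.5 * vol * vol) * t
    total = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * math.sqrt(t) * rng.gauss(0.0, 1.0))
        total += max(s_t - strike, 0.0)
    return math.exp(-rate * t) * total


def parallel_price(n_nodes, n_paths_per_node, executor_cls=ThreadPoolExecutor):
    """Farm independent chunks out to the nodes; the host only averages the results."""
    seeds = range(n_nodes)
    with executor_cls(max_workers=n_nodes) as pool:
        sums = list(pool.map(price_chunk, seeds, [n_paths_per_node] * n_nodes))
    return sum(sums) / (n_nodes * n_paths_per_node)
```

No chunk needs anything from any other chunk, so the only communication is the final transmission of each partial sum to the host. For CPU-bound work in Python, a process pool (`concurrent.futures.ProcessPoolExecutor`, passed as `executor_cls` from a script protected by the usual main guard) would be the more realistic stand-in for separate nodes.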


Vectorization and Parallelization 


It should be noted that "parallelization" and "vectorization" are not the same. 
Vectorization means that calculations are arranged if possible in a mode 
involving mathematical vectors; a typical example is a do-loop, if successive trips 
through the loop are independent of each other. Vector hardware processes these 
calculations in a pipeline or assembly-line fashion, with on the order of one 
operation performed every clock cycle. Parallel hardware, on the other hand, can 
perform many operations every clock cycle since there are many clocks, one for 
each processor (though these clocks may run slower). The relative efficiency of 
vector and parallel hardware depends on the application. Actually, vectorization 
and parallelization are complementary ideas. An advanced architecture combines 
vectorizing hardware in parallel, conceptually consisting of parallel copies of 
assembly lines. 
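
The do-loop example can be sketched as follows, assuming NumPy as a stand-in for vector hardware (the discount-factor calculation and the function names are illustrative choices, not from the text). Each trip through the loop is independent of the others, so the whole loop can be replaced by one operation on arrays.

```python
import numpy as np


def discount_factors_loop(rates, times):
    """Scalar version: one exponential per trip through the loop."""
    out = np.empty(len(rates))
    for i in range(len(rates)):  # successive trips are independent of each other
        out[i] = np.exp(-rates[i] * times[i])
    return out


def discount_factors_vectorized(rates, times):
    """Vectorized version: the same arithmetic expressed on whole arrays."""
    return np.exp(-np.asarray(rates, dtype=float) * np.asarray(times, dtype=float))
```

The two functions return the same numbers; only the arrangement of the calculation differs. A loop whose trips depend on earlier trips (e.g. a running sum fed back into the next step) would not vectorize this way.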


Parallel Languages and High Performance Fortran 


Software products can aid in the identification of portions of existing code that 
can parallelize. In particular, some languages have been extended with 
parallelization. The best example is High Performance Fortran or HPF." HPF is 
an extension of the new Fortran 90 standards, and is often used for high-end 
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scientific computationally-intensive applications. The idea is that the compiler 
sets up scheduling of parallel operations that exist in the code. 


Parallel Communication and the MPI 


The MPI (“Message-Passing Interface”) is a robust standard for communication 
between different processors to implement parallelized calculations. MPI was 
based on work at IBM, Intel and other companies that developed message- 
passing languages. MPI is designed to be an efficient and portable 
communication interface for Fortran and C code. MPI can be used with MIMD 
(Multiple Instruction, Multiple Data) programs. Communicating computers have 
their own local memory and send messages to each other to co-ordinate tasks. 
MPI-3 is the latest version for high performance computing. Software called 
PVM (“Parallel Virtual Machine”) can be used in conjunction with MPI. 


Parallel Computing and Data 


The existence of large databases and the need to maintain security, integrity, and 
cohesiveness is a challenging issue in distributed or parallel computing. The 
distribution of the data (the “load”) between workstations, or between chips in a 
parallel computer, has to be determined on a case-by-case basis. 

PCs or workstations individually do not have the flexibility and power of 
mainframes or large servers for handling large amounts of data. A common 
solution is a parallel network attached to a large server that handles the large 
databases. 


Technology, Strategy and Change 


New computing paradigms are ignored by a financial institution at the peril of 
losing the leading edge. At the same time prudence and reliability, along with the 
existing investments in software and hardware, must temper implementation of 
new technologies. Still, investment in an installed computing base should not be 
allowed to serve as an absolute inertial barrier to the examination and deployment 
of new powerful and cost-effective technology beneficial to long-term interests. 
Strategic decisions involve change; change is not without disruption, but the 
alternative may be stagnation. 


Systems Groups, End Users, and New Technology 


It is desirable for end-users to take a proactive role with the corporate computer 
Technology or Systems Groups in implementing new technologies in an optimal fashion. 
For maximum synergy in dealing with the difficult issues and decisions 
involved, the end-users have to understand systems issues and the systems groups 
have to understand end-user requirements. Unfortunately, the reality is generally 
that these two tribes understand each other’s language at best imperfectly. 
36. Data Risk: Qualitative, SSA, Generalized z-Score Polynomials (Tech. Index 5/10) 


A complex and knotty problem faced by financial risk management at the 
corporate level pertains to obtaining consistent, reliable, and complete financial 
data. Data problems produce sources of uncertainty for risk management. 
Specific issues with data are discussed in other chapters in this book. Here, we 
deal with some overall issues in a qualitative fashion. 

We present two useful methods for dealing with data: Singular Spectrum 
Analysis (SSA) with its Multivariate-SSA (MSSA) generalization, and General 
Measure Orthonormal Polynomials, generalizing the ubiquitous z-score. These 
techniques do not seem to be well known, but I believe they have general significance 
for analysis in finance. The Kalman Filter (Bayesian analysis) using SSA is 
introduced, and a brief analysis of co-integration using SSA is made. 


Important Qualitative Aspect of Data Risk 


Data Consistency Risk 


Because of resource limitations and historical development, data are typically 
fragmented across the organization in departmental databases. The firm-wide 
"back-office" or "books and records" databases contain the firm's official 
positions and their values, which are used for reporting purposes. These data are 
typically obtained from "feeds" from the "front-office" trading databases. The 
traders use their departmental databases to calculate their risks. 


1 What is the Fundamental Theorem of Data Risk? As mentioned several times in the 
book, this apparently tongue-in-cheek but actually serious statement says: "Data 
constitute another Black Hole of Finance". Some people think that this Fundamental Data 
Theorem is more profound than the Fundamental Theorem of Systems Risk (cf. Ch. 34). 


2 Data Sentence Practice for the Connoisseur and A Point of Grammar: Here are 
some handy data sentences to practice, all of which can be and have been used: "I don't 
believe your data", "My data are better than your data", "You're using the wrong data 
series", "The data don't imply what you're claiming", and "My judgment is more 
important than your data". Others: "We don't have the data", "The data feed broke 
again", and "We'll have access to the data next month", which can be used periodically. 

As an annoying point of grammar, the word "data" is actually a plural word; the 
singular from Latin is "datum". Thus, it is improper to say: "this data stinks"; the correct 
version would be: "these data stink". 

The departmental risks need to be aggregated at a corporate level. Corporate 
risk management likes to use official data. Unfortunately, a back-office database 
often does not contain critical information needed for risk management. This 
leads to a number of serious problems. Either a firm-wide database is constructed 
with the needed risk information, or feeds containing required risk information 
are brought in from the front office. Neither solution is very good. Consistency 
problems can arise between databases if a separate corporate database is 
constructed. On the other hand, front-office feeds must be ironed out to avoid 
misunderstanding. Front-office conventions for data representation are sometimes 
unclear, often undocumented in writing, and may change without warning. 
Moreover, the tolerance of the traders for data work done for corporate risk 
management is limited, because this activity does not make any money for them. 


Data Reliability Risk 


Reliability of the data in any database is a serious potential issue, requiring 
dedicated personnel who understand the database structure as well as the 
financial content of the data. Such dedicated personnel are often absent or 
inadequate in number. Data errors (sometimes from human input error, 
sometimes from feed error, sometimes from lack of input at all) exist. This 
sometimes makes it difficult to distinguish a true outlier event from a simple error. 


Data Completeness Risk 


Completeness of data is a serious issue. If the front office does not trade in a 
particular sector, traders will generally see no reason to put data corresponding to 
that sector in the database. Then later, if trading does occur in that sector, data 
available for historical comparison of risk will be limited for that sector. This is 
an issue, because misleading results can be obtained by performing risk analysis 
using limited data. In particular, if the data only go back over periods when the 
markets were relatively stable, the potential market stresses from turbulent 
periods will not be realistically evaluated. This is true no matter what statistical 
tests or which high confidence levels are used. 


Data Vendors Risk 


Data vendors exist and are often used in the absence of, in place of, or to 
complement, in-house information. However, inconsistencies sometimes exist 
between data from different vendors. Moreover, the algorithms used by the 
vendors to construct their data time series from their market sources are 
sometimes not available. This can lead to problems since one algorithm or set of 
sources can be used for one time series of data, and another algorithm or set of 
sources for another time series. 
Still, it can be said that the data vendors live or die by their product, and they 
do their best to cope with an inherently extremely difficult problem. 


Historical Data Problems and Data Groups 


It takes a long time to discuss problems with historical data. These include 
distinguishing outliers from bad data points, coping with missing data points, 
inconsistent measurement frequencies for different time series, effects of 
overlapping windows needed when not enough data are present, inconsistent data 
from different sources, strings of zeros entered by traders who do not happen to 
have positions in that variable at the moment, etc. 

This highlights the need for a separate group that is dedicated to dealing with 
these perhaps boring but critically important and thorny data issues. A data 
quality group is not the same as the group of systems programmers that handle 
the representation of the data in the computer using a database. It is sometimes 
difficult both to get management to understand these issues and to obtain the 
appropriate resources to deal with the problems adequately. 


Preparation of the Data 


Even given pristine data, we must still be concerned about preparation of the data 
as input to analysis. Preparation can include smoothing techniques (various 
choices of moving averages or centered moving averages; splines, Padé 
approximants), weighting techniques emphasizing recent data or data from a 
particular time period which might be considered more relevant, possibly putting 
caps restricting large moves which happened in the past but "can never happen 
again", etc. The definitions of the variables to use in the analysis need to be 
specified; these can be various functions of the underlying time series (e.g. 
returns, simple differences, or other). 
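
A few of these preparation steps can be sketched in code. The sketch below assumes NumPy; the function names, the choice of log returns, the half-life weighting, and the n-sigma cap are illustrative assumptions rather than prescriptions from the text.

```python
import numpy as np


def log_returns(prices):
    """One possible variable definition: log returns of the underlying series."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))


def centered_moving_average(x, window):
    """Centered moving average with an odd window; edge points are left as NaN."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered average")
    x = np.asarray(x, dtype=float)
    half = window // 2
    out = np.full(x.size, np.nan)
    out[half:x.size - half] = np.convolve(x, np.ones(window) / window, mode="valid")
    return out


def exponential_weights(n, half_life):
    """Weights emphasizing recent data; the last observation gets the largest weight."""
    ages = np.arange(n - 1, -1, -1)  # age 0 is the most recent point
    w = 0.5 ** (ages / half_life)
    return w / w.sum()


def cap_moves(returns, n_sigma):
    """Cap moves larger than n_sigma sample standard deviations."""
    r = np.asarray(returns, dtype=float)
    limit = n_sigma * r.std()
    return np.clip(r, -limit, limit)
```

Each choice (window length, half-life, cap level) changes the input to the risk analysis, so in practice it would need to be documented alongside the results.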


Bad Data Points and Other Data Traps 


Distinguishing bad data points from valid outliers can require an in-depth 
knowledge of the market. The presence of typos or zeros in the series is an 
obvious problem, but there are more subtle issues of various types of data traps. 

For example, bad data can occur for a spread constructed by subtracting two 
time series that are individually reasonable but constructed with inconsistent 
methodologies (probably undocumented). It is moreover common that proxy 
spread index data are used for a particular spread, either because the data do not 
exist or because there are too many spreads for the risk model (e.g. a VAR 
simulator) to handle explicitly. The outliers for the proxy data may not be outliers 
for the spread — or conversely. Naturally there can be lively discussions regarding 
the appropriateness or inappropriateness of the proxy index used — or why the 
simulator is not using more indices. 
We turn next to some mathematical techniques for handling data that are not 
commonly known in finance. 


NEW TOPIC: SSA, MSSA, and Data Smoothing/Cleaning 


In studying climate change (cf. Ch. 53), I became aware that geophysicists use a 
technique called Singular Spectrum Analysis or SSA to deal with data in various 
ways. I believe that SSA and its generalization MSSA (Multivariate SSA) will 
prove very useful; the technique does not seem to be generally known in finance. 
Here are some examples: 


1. Filling missing data points in financial time series, modifying a method 
used by geophysicists for climate data. 

2. Defining the Macro trends for the Macro-Micro model for real-world 
potential future exposure (PFE) simulations (Ch. 31). 

3. Defining noise-reduced stable correlations for long-term counterparty 
risk simulations (Ch. 37). 

4. Keeling CO2 data analysis removing noise, and short-term forecasting 
(see the Bloomberg Carbon Clock). 


Basic Formalism of SSA 
The basic idea of SSA is an expansion of a time series $X(t)$ with respect to 
eigenvectors of the autocorrelation matrix of the time series itself. That is, the 
series is expanded using information from the same series at different points in 
time. This is useful to isolate various aspects of the series, from trends to 
oscillations that have inherent memory. In a sense SSA produces a clever 
"moving average" with sophisticated weights. 

I next review the SSA formalism briefly. Set $t = 1 \ldots N$ and define the "lag 
covariance matrix" $C_X$, composed of autocorrelations of the time series $X(t)$ 
with time lags $k = 1 \ldots M$. The eigenvectors $\{E_k\}$ of $C_X$ form a complete set. 
The $k$-th eigenvector $E_k$ has components $\{E_k(j)\}$ with $j = 1 \ldots M$. Then: 

$$X(t) = \sum_{k=1}^{M} R_k(t) \qquad (36.1)$$



* Acknowledgements: I thank Adam Litke, Harvey Stein, Nora Omarova, Mario 
Bondioli, Yan Zhang, and Xipei Yang for discussions on SSA and MSSA. 


* Rosetta Stone for SSA Notations: Translation between the upper and lower half of 
Wikipedia page: (X,L,K,N) for lower page = (D',M,N’,N) for upper page; N’=N-M+1. 
Here the "reconstruction component" $R_k(t)$ using the $k$-th eigenvector is 

$$R_k(t) = \frac{1}{M} \sum_{j=1}^{M} \sum_{j'=1}^{M} X(t + j' - j)\, E_k(j')\, E_k(j) \qquad (36.2)$$

The identity keeping all terms can be seen by combining the above two 
equations, interchanging the $k$ sum with the $j, j'$ sums, and using the fact that 
the eigenvectors of the lag-covariance matrix are complete, viz: 

$$\sum_{k=1}^{M} E_k(j)\, E_k(j') = \delta_{jj'} \qquad (36.3)$$

An approximation $X^{(K)}(t)$ to $X(t)$ retains only $K$ terms, viz 

$$X(t) \approx X^{(K)}(t) = \sum_{k=1}^{K} R_k(t) \qquad (36.4)$$


This is the essential point. The terms retained are designed for specific purposes. 
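
The following is a minimal sketch of an SSA decomposition, assuming NumPy. It uses the common trajectory-matrix (SVD) variant with diagonal averaging, which treats the ends of the series automatically; it is not necessarily the exact unpadded formula of this section. The function name `ssa_components` is an illustrative assumption. The right singular vectors play the role of the eigenvectors of the lag-covariance matrix, and each row of the output plays the role of one reconstruction component.

```python
import numpy as np


def ssa_components(x, M):
    """Return an array whose row k is the k-th reconstruction component of x."""
    x = np.asarray(x, dtype=float)
    N = x.size
    # Trajectory matrix: row i holds the lagged window x[i], ..., x[i + M - 1].
    idx = np.arange(N - M + 1)[:, None] + np.arange(M)[None, :]
    U, s, Vt = np.linalg.svd(x[idx], full_matrices=False)
    counts = np.bincount(idx.ravel(), minlength=N)  # windows covering each time point
    comps = np.zeros((s.size, N))
    for k in range(s.size):
        elementary = s[k] * np.outer(U[:, k], Vt[k])  # rank-one piece for eigenvector k
        # Diagonal averaging: average all entries that refer to the same time point.
        comps[k] = np.bincount(idx.ravel(), weights=elementary.ravel(), minlength=N) / counts
    return comps
```

Summing all rows returns the original series, while summing only the first few rows (those with the largest singular values) gives a smoothed approximation. Which components to keep depends on the purpose, e.g. trend extraction versus noise removal.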


Basic Formalism of MSSA 
Multivariate-SSA (MSSA) generalizes SSA for different time series $\{X_\alpha(t)\}$ 
with $\alpha = 1 \ldots A$, treated together. The MSSA expansion of one of the time series 
$X_\alpha(t)$ has 3 effects, making the method very powerful: 
1. Time lags through autocorrelations of $X_\alpha(t)$ (like SSA) 
2. Ordinary cross-correlation between two series $X_\alpha(t)$, $X_\beta(t)$ at equal 
times (like principal component analysis PCA) 
3. Lagged cross-correlations between two series $X_\alpha(t)$, $X_\beta(t')$ at different 
times $t \neq t'$ (in neither SSA nor PCA) 


5 Complications at the end of the time series: A “padded” version of SSA involves 
lower and upper limits that are t-dependent. We show the simplest “unpadded” version. 
Corrections to the reconstruction formula need to be made near boundaries (e.g. 
today, at the end of time series), within M time periods. One procedure adds forecast 
assumptions; this is also the procedure for using SSA for forecasting. 
I thank Xipei Yang for discussions on this and many other topics. 
An eigenvector $E_k$ of the lag-covariance matrix with all time series has 
components $\{E_{k\alpha}(j)\}$ with $k = 1 \ldots AM$. The eigenvectors satisfy the completeness relation 

$$\sum_{k=1}^{AM} E_{k\alpha}(j)\, E_{k\beta}(j') = \delta_{\alpha\beta}\, \delta_{jj'} \qquad (36.5)$$

The approximate reconstruction equation for $X_\alpha(t)$ including $K$ eigenvectors is 

$$X_\alpha^{(K)}(t) = \frac{1}{M} \sum_{k=1}^{K} \sum_{\beta=1}^{A} \sum_{j,j'=1}^{M} X_\beta(t + j' - j)\, E_{k\beta}(j')\, E_{k\alpha}(j) \qquad (36.6)$$

Note that the term with $\beta = \alpha$ and $j' = j$, which contains $X_\alpha(t)$ itself, appears 
on the right hand side of Eq. (36.6). 
This leads to a bootstrap procedure. A trial value $X_\alpha^{trial}(t)$ for $X_\alpha(t)$ is inserted 
on the right hand side of Eq. (36.6), a modified value $X_\alpha^{(K)}(t)$ is recovered 
from the left hand side of Eq. (36.6), and the process is repeated. This idea is 
behind the MSSA techniques for filling missing data points in time series. 
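
The bootstrap idea can be sketched for a single series (plain SSA rather than MSSA, to keep the example short). This is a simplified illustration assuming NumPy, not the procedure used in production: missing points are marked as NaN, given trial values, and then repeatedly replaced by a reconstruction that keeps only the K leading components. The names `rank_k_reconstruction` and `fill_gaps`, and the use of an SVD with diagonal averaging, are assumptions of this sketch.

```python
import numpy as np


def rank_k_reconstruction(x, M, K):
    """Approximate x by the K leading SSA components (SVD plus diagonal averaging)."""
    N = x.size
    idx = np.arange(N - M + 1)[:, None] + np.arange(M)[None, :]
    U, s, Vt = np.linalg.svd(x[idx], full_matrices=False)
    approx = (U[:, :K] * s[:K]) @ Vt[:K]
    counts = np.bincount(idx.ravel(), minlength=N)
    return np.bincount(idx.ravel(), weights=approx.ravel(), minlength=N) / counts


def fill_gaps(x, M, K, n_iter=100):
    """Iteratively fill NaN gaps: insert trial values, reconstruct, update, repeat."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    filled = np.where(missing, np.nanmean(x), x)  # trial values for the gaps
    for _ in range(n_iter):
        recon = rank_k_reconstruction(filled, M, K)
        filled[missing] = recon[missing]  # observed points are never changed
    return filled
```

For example, `fill_gaps(series_with_nans, M=40, K=2)` would be a natural call for a series dominated by a single oscillation. The multivariate version proceeds in the same spirit, with the other series supplying additional information for the gap.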


Kalman Filter (Bayesian Analysis) using SSA 


A Kalman filter (Bayesian analysis) merges theory and observation to obtain a 
better estimate than either. Kalman filters are widely used in engineering, 
economics, etc. Here we apply SSA. Consider variable $X(t)$ at time $t$. Eq. 
(36.6) provides a theoretical SSA forecast $X_{th}(t)$ using previous observations 
$\{t' < t\}$, starting from an initial assumed forecast value. 
Now assume an observation $X_{obs}(t)$ occurs. The observation $X_{obs}(t)$ has 
error $\sigma_{obs}$, and the SSA theory $X_{th}(t)$ has error $\sigma_{th}$, which must be supplied 
(e.g. from history). Assume for simplicity that the observations are uncorrelated 
with the model. This can be relaxed if the correlations are known (see Maybeck). 

The Kalman filter $X_{KF}(t)$ for $X(t)$ is a weighted average of the observation 
$X_{obs}(t)$ and the theory $X_{th}(t)$, viz 

$$X_{KF}(t) = w\, X_{th}(t) + (1 - w)\, X_{obs}(t) \qquad (36.7)$$


° Acknowledgements: I thank Yan Zhang and Zhaoou Yu for discussions. 
The variance $\sigma_{KF}^2 = \langle (X_{KF} - \langle X_{KF} \rangle)^2 \rangle$ is 
$\sigma_{KF}^2 = w^2 \sigma_{th}^2 + (1 - w)^2 \sigma_{obs}^2$. The 
weight $w$ is given by $w = \sigma_{KF}^2 / \sigma_{th}^2$ and $1 - w = \sigma_{KF}^2 / \sigma_{obs}^2$. The formula for $\sigma_{KF}^2$ is the 
same as for the total resistance of two resistors $\sigma_{obs}^2$ and $\sigma_{th}^2$ in a parallel 
electric circuit, and is less than either. Thus the Kalman filter has less uncertainty 
than either the observation or the theory. Explicitly we have 

$$\frac{1}{\sigma_{KF}^2} = \frac{1}{\sigma_{obs}^2} + \frac{1}{\sigma_{th}^2} \qquad (36.8)$$
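
A small numerical sketch of this weighted average follows. It assumes the reading used here, namely that the combined variance obeys the parallel-resistor rule and that the weight on the theory is the combined variance divided by the theory variance; the function name `kalman_combine` is illustrative.

```python
def kalman_combine(x_th, sigma_th, x_obs, sigma_obs):
    """Combine an uncorrelated theory value and observation; return (estimate, error)."""
    var_kf = 1.0 / (1.0 / sigma_obs ** 2 + 1.0 / sigma_th ** 2)  # parallel-resistor rule
    w = var_kf / sigma_th ** 2  # weight on the theory; 1 - w is the weight on the observation
    x_kf = w * x_th + (1.0 - w) * x_obs
    return x_kf, var_kf ** 0.5
```

With a theory value of 1.0 (error 2.0) and an observation of 3.0 (error 1.0), the weight on the theory is 0.2, so the estimate is 2.6 and the combined error is about 0.89, below both input errors. The more uncertain input always receives the smaller weight.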


Co-integration and SSA — Qualitative Remarks 

Co-integration in its simplest form is a special relation that holds if a linear 
combination of the two time series $X_1(t)$ and $X_2(t)$ is stationary. The simplest 
form of stationarity is mean-reverting Gaussian (MRG) noise $W_{MRG}(t)$ (cf. Ch. 
43). So for some appropriate time-independent constant $\lambda$, co-integration states 

$$X_1(t) - \lambda X_2(t) = W_{MRG}(t) \qquad (36.9)$$

Intuitively the picture is that the two series $X_1(t)$ and $\lambda X_2(t)$ move around 
each other in time, with the difference not increasing with time. This produces 
long-term stability between co-integrated time series. The difference cannot be a 
Gaussian random walk, because the random walk variance is proportional to time 
(resulting from a "unit root" in the Gaussian process), which would lead to the 
series diverging from each other. 

Consider the SSA reconstruction formula Eq. (36.2) using components 
$E_k(j)$ of the $k$-th eigenvector $E_k$ of the SSA lag-covariance matrix. The 
eigenvectors $\{E_k\}$ for co-integrated $X_1(t)$ and $\lambda X_2(t)$ must be essentially 
identical, up to noise from those $\{E_k\}$ with small eigenvalues $\{\lambda_k\}$. The SSA 
eigenvectors $\{E_k\}$ for large eigenvalues $\{\lambda_k\}$ determine the drift properties of 
the time series. In general the drift changes with time, which makes it quite 
difficult for two time series to be co-integrated. If the drifts of the two time series 
satisfy Eq. (36.9), then the two time series are likely to be co-integrated. 
NEW: Generalized z-scores: General Measure Polynomials 


To characterize a statistical probability distribution $F(x)$ of a variable $x$, the 
moments of the distribution are used; the first two are the mean $\mu$ and standard 
deviation $\sigma$. Set $\tilde{x} = x - \mu$. The z-score $z = \tilde{x}/\sigma$ is often used to characterize 
data points of $x$ (e.g. outliers that have large z-scores). Polynomials $\{f^{(n)}(x)\}$ 
with respect to $F(x)$ can be constructed that generalize the z-score, which we 
show next. These polynomials can be useful to refine the characterization of 
data. 

Besides useful data metrics, I believe that the general measure polynomials 
can be used for a more complete description of VAR tail events than the expected 
shortfall (cf. Ch. 27). 

The polynomials are independent in the sense that they are orthogonal with 
respect to a weight function that is the distribution. Say $F(x) \geq 0$ for all $x$ (the 
discrete version is a histogram). Define the measure $d\mathcal{F}(x) = F(x)\,dx$, so 
$F(x) = \mathcal{F}'(x)$. Then the polynomials $\{f^{(n)}(x)\}$ are orthonormal, viz: 

$$\langle f^{(n)} | f^{(m)} \rangle = \int f^{(n)}(x)\, f^{(m)}(x)\, d\mathcal{F}(x) = \delta_{nm} \qquad (36.10)$$

Here $\delta_{nm}$ is equal to zero for different indices $n, m$ and one if the indices are 
equal. The moments of $F(x)$ with $\tilde{x} = x - \mu$ are $c_n = \int (x - \mu)^n\, d\mathcal{F}(x)$ with 
$\sigma^2 = c_2$. We organize the moments into cumulants $\{C_n\}$; the first five are 
$C_1 = \mu$, $C_2 = \sigma^2$, $C_3 = c_3$, $C_4 = c_4 - 3\sigma^4$, $C_5 = c_5 - 10\sigma^2 c_3$. 

We get $f^{(0)}(x) = 1$ and $f^{(1)}(x)$ = z-score = $\tilde{x}/\sigma$. 


7 Story - General Measure Orthogonal Polynomials and Chebyshev: I formulated the 
problem and worked out a few of these polynomials by hand; then in a discussion with 
Alex Grossmann discovered that Chebyshev had solved the general problem. 

Note that these polynomials are NOT the standard Chebyshev polynomials, which 
have a specific measure. 


* Acknowledgement: I thank Stan Maydan for reading to me the section in Chebyshev's 
collected works in Russian on the subject. 


9 Another Name for Cumulants ("Connected Parts"): Used in high energy physics in 
the decomposition of S-Matrix scattering amplitudes. 
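The first new polynomial that extends the z-score is the quadratic $f^{(2)}(x)$, 
which involves the 3rd cumulant or skew $C_3$: 

$$f^{(2)}(x) = N_2 \left[ -C_3\, \tilde{x} + \sigma^2 \left( \tilde{x}^2 - \sigma^2 \right) \right] \qquad (36.11)$$

Here $N_2$ is a normalization factor so $f^{(2)}$ has unit length, $\langle f^{(2)} | f^{(2)} \rangle = 1$. 

The third-order polynomial $f^{(3)}(x)$ has the 4th and 5th cumulants $C_4, C_5$: 

$$f^{(3)}(x) = N_3 \left[ D \left( \tilde{x}^3 - C_3 \right) + \gamma\, \tilde{x} + \eta \left( \tilde{x}^2 - \sigma^2 \right) \right] \qquad (36.12)$$

Defining $D = \sigma^2 C_4 - C_3^2 + 2\sigma^6$, we have 

$$\gamma = C_3 C_5 + 9\sigma^2 C_3^2 - C_4^2 - 5\sigma^4 C_4 - 6\sigma^8, \qquad \eta = -\sigma^2 C_5 + C_3 C_4 - 6\sigma^4 C_3 \qquad (36.13)$$

Also $N_3$ is a normalization factor. 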
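
For a data sample, polynomials of this type can be generated numerically instead of by hand. The sketch below assumes NumPy and uses the empirical distribution of the sample as the measure, orthonormalizing the powers of the centered variable with a QR decomposition (a numerical stand-in for Gram-Schmidt). The function name and the QR approach are assumptions of this sketch, not the construction in the text.

```python
import numpy as np


def orthonormal_polynomials(samples, degree):
    """Polynomials in (x - mean), orthonormal under the sample's empirical measure.

    Returns (coeffs, values): column n of coeffs holds the coefficients of the n-th
    polynomial in increasing powers of (x - mean); values[i, n] is that polynomial
    evaluated at sample i.
    """
    x = np.asarray(samples, dtype=float)
    xt = x - x.mean()  # centered variable
    powers = np.vander(xt, degree + 1, increasing=True)  # columns 1, xt, xt^2, ...
    _, r = np.linalg.qr(powers / np.sqrt(x.size))
    r = np.sign(np.diag(r))[:, None] * r  # make every leading coefficient positive
    coeffs = np.linalg.inv(r)
    return coeffs, powers @ coeffs
```

The zeroth polynomial is identically one and the first is the z-score (using the population standard deviation). For a skewed sample, the second should be proportional to the quadratic built from the variance and the third central moment, minus the third moment times the centered variable plus the variance times (the centered variable squared minus the variance), which offers a quick numerical check of a hand calculation.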
37. Correlations, Data, and Random Matrix Theory (Tech. Index 6/10) 


In this chapter, we deal with some important aspects of correlations and data. We 
discuss windowing uncertainties, including overlapping vs. non-overlapping 
windows. We discuss uncertainties due to the limited amount of data relative to 
the number of variables. We also discuss intrinsic dynamically generated 
correlation uncertainties. See also Ch. 23-25. 


We end with a discussion of noise-cleaned correlations using SSA (cf. Ch. 
36) and Random Matrix Theory tests, new to this 2nd edition. 


Fluctuations and Uncertainties in Measured Correlations 


Correlations have several sources of instability, both statistical and dynamical. 
Consider this graph: 

[Figure: Corr(Ag, DM), 126-day rolling correlation] 

The graph above shows the correlation between silver Ag and the German mark 
DM currency exchange rate to USD in 1992-93. The reader can see that the 
correlation varied widely. This is a typical result. The existence of large 
uncertainties or fluctuations for correlations cannot be in doubt. 


Time Windowing 


We now discuss issues related to the time windows in which correlations are 
measured. One is the window size, and another is whether overlapping or non- 
overlapping windows are used. We treat these in turn. 


Window Size 


The time length of the windows used to get the correlations should be determined 
by financial relevance. We need to use finite-sized time windows in risk 
management because we cannot wait forever to analyze why a strategy might be 
losing money. Time scales for instabilities in correlations can be short. The 
collective panic and flight to quality in markets that are suddenly disturbed by 
bad news lead to sudden changes in correlations. 

A convenient window size for corporate risk management is the reporting 
interval of three months. Analyses that use long-term averages over correlations 
are only valid for long-term buy-and-hold management. 

The standard measure for the uncertainty in a correlation due to windowing 
noise is the Fisher result. If the measured correlation (using samples of size N) 
is denoted as r, and if a true correlation value ρ exists, then the expected value 
of r is ρ and the uncertainty of r is approximately (cf. Ch. 38) 

$$\sigma_{Fisher} = (1 - r^2) / \sqrt{N - 3} \qquad (37.1)$$


This statistical uncertainty is only part of the story since strong intrinsic 
uncertainties in correlations exist that have nothing to do with the above formula. 
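
As a quick numerical illustration of the Fisher windowing uncertainty, (1 - r^2)/sqrt(N - 3), the helper below evaluates it for a given correlation and window size. The function name is an illustrative assumption.

```python
import math


def fisher_uncertainty(r, n):
    """Approximate one-standard-deviation windowing uncertainty of a correlation r."""
    if n <= 3:
        raise ValueError("need more than 3 observations")
    return (1.0 - r * r) / math.sqrt(n - 3)
```

For a correlation of 0.1 measured in a 65-day window this gives about 0.13, which is larger than the correlation itself. The uncertainty falls only as the square root of the window size, so quadrupling the window roughly halves it.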


Overlapping and Non-Overlapping Windows 


The second issue is whether or not to use overlapping windows. The windows 
generally used for risk management are overlapping. The problem is that we 
usually do not have enough data to be able to afford the luxury of non- 
overlapping windows. Measurements in a sequence of overlapping windows are 
serially correlated because the same measurement exists in different windows. 


1 Acknowledgements: The Ag, DM correlation is the same example as in my CIFEr 
tutorials 1996-2001. We thank Citigroup for the use of the data. 
If we are interested in the maximum and minimum correlations, then we want 
to use overlapping windows. This is because each of the windows represents a 
realized physical state in history for a correlation, and we do not want to throw 
any states away, serially correlated or not. 

The fluctuations of correlations measured with overlapping windows 
are smaller than those of correlations measured with non-overlapping windows, for a 
given number of windows. This is because the same measurements occur many times in 
overlapping windows, suppressing the fluctuations, whereas different 
measurements occur in non-overlapping windows, enhancing fluctuations. 


Example of the Effects of Windowing 

In this example all the effects listed above are shown. The example uses 500 
points of a bivariate Gaussian distribution with fixed correlation ρ = 0.1. We 
then attempt to measure the correlation using windows of different sizes, both 
overlapping and non-overlapping. The results for the measured correlation 
uncertainties along with the Fisher uncertainty are shown in the figure below: 


[Figure: Uncertainties in Correlations. Series shown: Overlapping; Non-overlapping; Fisher (1-corr^2)/sqrt(N-3)] 
All measures of the uncertainty are similar. The Fisher uncertainty is largest. 
The non-overlapping window uncertainties are somewhat larger than the 
overlapping window uncertainties as expected. For corporate risk reporting using 
3-month (65 business day) windows, the overlapping uncertainty is about 80% of 
the Fisher uncertainty. For large window sizes, only a few non-overlapping 
windows are present. The overlapping-window uncertainties have the advantage 
of forming a smooth curve. 
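
An exercise of this kind can be sketched as follows, assuming NumPy. The function names, the seed, and the way the correlated pair is generated are illustrative assumptions; the numbers produced will not reproduce the figure exactly.

```python
import numpy as np


def simulate_pair(n, rho, seed=0):
    """Generate n points of a bivariate Gaussian with correlation rho."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 2))
    x = z[:, 0]
    y = rho * z[:, 0] + np.sqrt(1.0 - rho ** 2) * z[:, 1]
    return x, y


def window_correlations(x, y, window, overlapping=True):
    """Correlations measured in successive windows of the given length."""
    step = 1 if overlapping else window
    starts = range(0, len(x) - window + 1, step)
    return np.array([np.corrcoef(x[s:s + window], y[s:s + window])[0, 1] for s in starts])
```

With 500 points and a 65-day window there are 436 overlapping windows but only 7 non-overlapping ones. Taking the standard deviation of each set of windowed correlations, e.g. `window_correlations(x, y, 65).std()`, and comparing it with the Fisher value for the same window size gives the type of comparison described above.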


Example for Effect of Jumps on Correlations 
We next show the results when we insert a 10 standard deviation jump up at one 
time $t_{jump}$ in the second time series, otherwise generated by the above bivariate 
Monte-Carlo. See also Ch. 25 where a similar exercise was performed. The effect 
of the jump on the correlation depends on what the first variable does at time 
$t_{jump}$. For example, if the first variable does not change at $t_{jump}$ then there is no 
effect on the correlation. However there can be substantial effects if the first 
variable does move. The following plot shows the results for a particularly large 
effect on the correlation by the jump in this model (using overlapping windows). 


[Figure: Measures of Correlation Instabilities. Series shown: Windowed Corr, No jump; Windowed Corr, With jump; Fixed Nominal Corr] 
Correlations, the Number of Data Points, and Variables 


It seems self-evident that the number N of data points used to determine 
correlations should be greater than the number of variables p in order to have a 
well-defined data procedure. Mathematically, N > p is required to have 
non-degenerate correlations, and for the Wishart Theorem (cf. Ch. 38). However, this 
condition may not be satisfied in practice in risk management. For example if we 
have two years of data (and even that is sometimes hard to get), then N ≈ 500. 
However the number of variables p can easily run into the thousands (including 
various interest rates and spreads in various currencies, individual stocks, 
commodities, idiosyncratic risks for various products, and so on). 

We have several remarks. First, if we actually measure correlations with 
N < p and compare these correlations to the case when N > p we find that 
there is not much difference in the values of the correlations. Mathematically 
what happens is that some correlations become approximately degenerate, i.e. 
take on similar values as other correlations when N < p. However, for large N 
and most data, many correlations with N > p tend to lie close to one another. 
When N < p, the changes in correlations due to the degeneracies are not large. 


Example with Different Numbers of Observations and Variables 


Here is an illustrative pedagogical example. We can run a multivariate Monte 
Carlo simulation with fixed input correlations. We can vary the number of 
observations N and the number of variables p, and look at what happens to an 
individual specific correlation $\rho_{specific}$. If there is no big effect, then regardless of 
the values of N and p, as we run the simulation we should get the input $\rho_{specific}$ 
on the average, with a standard deviation error roughly given by the Fisher 
windowing uncertainty. 

The particular correlation chosen for comparison in the exercise was taken as 
$\rho_{specific} = 0.53$, with other values for the other correlations. The value 0.53 is a 
high correlation. Therefore, if there is a problem with the correlation moving 
because of the degeneracies, we should notice it. 
The output Monte-Carlo average correlations for different N, p are within 
errors of the input correlation 0.53. Nothing special or catastrophic happens as 
the N = p point is crossed. The correlation errors are consistent with the Fisher 
uncertainties $\sigma_{Fisher} = [1 - (0.53)^2] / \sqrt{N - 3}$. 


2 Zero Eigenvalues: If N ≤ p the correlation matrix has p-N+1 zero eigenvalues. 


3 Acknowledgements: The input correlations come from data ending in 1999. We thank 
Citigroup for the use of the data statistics, run by J. Hill. 


Here are illustrative results for the average MC correlation: 


Average MC correlation | N — 126 N-252 N = 504 
(nb: within 0.53 + error) 

p=2 0.49 0.49 0.49 
p=181 0.58 0.56 0.56 
p=279 0.53 0.52 0.52 

The results for the statistical errors are: 

e, MC statistical error | N = 126 N=252 N = 504 
р=2 0.07 0.05 0.03 
р= 181 0.06 0.04 0.03 
р=279 0.06 0.04 0.03 

о isher uncertainty 0.06 0.05 0.03 
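The experiment can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the simulation that produced the tables: the one-factor structure used to build the input correlations, the background loading of 0.5, the trial count, and the names `simulate_pair_correlation` and `fisher_sigma` are all hypothetical choices made here.

```python
import numpy as np

def fisher_sigma(rho, n_obs):
    """Fisher windowing uncertainty (1 - rho^2) / sqrt(N - 3)."""
    return (1.0 - rho**2) / np.sqrt(n_obs - 3)

def simulate_pair_correlation(n_obs, n_vars, rho_specific=0.53,
                              n_trials=400, seed=0):
    """Monte Carlo average and standard deviation of one sample correlation.

    Assumption: input correlations come from a one-factor model,
    x = loading * factor + residual, so corr(i, j) = loading_i * loading_j.
    Variables 0 and 1 carry the specific correlation; the rest use 0.5.
    """
    rng = np.random.default_rng(seed)
    loadings = np.full(n_vars, 0.5)
    loadings[:2] = np.sqrt(rho_specific)      # corr(0, 1) = rho_specific
    resid = np.sqrt(1.0 - loadings**2)        # keeps unit variance
    estimates = np.empty(n_trials)
    for t in range(n_trials):
        factor = rng.standard_normal((n_obs, 1))
        noise = rng.standard_normal((n_obs, n_vars))
        x = factor * loadings + noise * resid
        # Full p x p sample correlation matrix (singular when N <= p)
        corr = np.corrcoef(x, rowvar=False)
        estimates[t] = corr[0, 1]
    return estimates.mean(), estimates.std(ddof=1)

if __name__ == "__main__":
    for p in (2, 181, 279):
        for n in (126, 252, 504):
            mean, err = simulate_pair_correlation(n, p, n_trials=200)
            print(p, n, round(mean, 2), round(err, 2),
                  round(fisher_sigma(0.53, n), 2))
```

In this sketch the singular N < p correlation matrix is computed in full, yet the selected element still averages to the input value within the Fisher error, which is the point of the section.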


Bottom Line: In Practice, N > p is a Red Herring 


The results illustrate what is probably a general property. Although the condition 
N > p is required mathematically, degeneracies resulting from measurements 


with N < p do not affect the numerical values of the correlations much. 
The bottom line is that, practically speaking, the academic constraint N > p 


appears to be of little significance, and can be disregarded. The apparent need for 
N > p is a red herring. The procedures of risk management are not endangered. 


Intrinsic and Windowing Uncertainties: Example 


We have buffeted the reader with the statement that correlations have more 
uncertainty than just that given by windowing statistical noise. Deviations from a 
putative academic correlation $\rho_{\text{Academic}}$ are due to both intrinsic and windowing 
uncertainties. 

An illustrative example is the Ag, DM correlation given at the beginning of 
this section. This correlation is measured using overlapping windows. Hence the 
correlation uncertainty at one standard deviation should be less than or at most on 
the order of the Fisher uncertainty if there is no intrinsic correlation instability. 
On the other hand, if there is intrinsic instability, the observed uncertainty will be 
greater than the Fisher uncertainty. 

The idea is illustrated below. 
[Figure: Uncertainties in Correlations (Intrinsic, Windowing), showing the intrinsic + windowing uncertainties around $\rho_{\text{Academic}}$.] 

The Fisher Transform for Correlations 
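The Fisher transform $R_{\text{Fisher}}$ of the correlation data r is defined as* 

$$R_{\text{Fisher}} = \frac{1}{2}\ln\left[\frac{1 + r}{1 - r}\right] \tag{37.2}$$

Call $R_{\text{Fisher}}^{\text{Avg}}$ the average of the Fisher-transformed data with standard 
deviation $\sigma(R_{\text{Fisher}})$. Then set $R_{\pm} = R_{\text{Fisher}}^{\text{Avg}} \pm \sigma(R_{\text{Fisher}})$, and transform back 
using $r_{\pm} = \tanh R_{\pm}$. Finally, define the observed uncertainty in the correlation 
data at the 68% CL using $\pm\sigma_r^{\text{Data}} = \pm(r_{+} - r_{-})/2$. 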
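The recipe above is short enough to write out directly. The following is a minimal sketch; the function names are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np

def observed_correlation_uncertainty(r_series):
    """Observed 68% CL uncertainty of a series of windowed correlations.

    Follows the steps in the text: Fisher-transform the data, take the
    average and standard deviation, step +/- one SD, transform back.
    """
    r = np.asarray(r_series, dtype=float)
    big_r = np.arctanh(r)                 # 0.5 * ln[(1 + r) / (1 - r)]
    r_avg = big_r.mean()
    sigma_big_r = big_r.std(ddof=1)       # assumption: sample SD
    r_plus = np.tanh(r_avg + sigma_big_r)
    r_minus = np.tanh(r_avg - sigma_big_r)
    return r_minus, r_plus, 0.5 * (r_plus - r_minus)

def fisher_windowing_sigma(r_avg, n_window):
    """Fisher windowing uncertainty (1 - r^2) / sqrt(N - 3) for comparison."""
    return (1.0 - r_avg**2) / np.sqrt(n_window - 3)
```

If the third returned value is much larger than `fisher_windowing_sigma` for the same window length, windowing noise alone does not explain the variability.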


Data Uncertainties Compared with Fisher Uncertainty 
Here are the results for the Ag, DM correlation: 


^ Fisher's Transform: We discuss the Fisher transform formalism in Ch. 38. 
Observed Uncertainty of Correlation | Fisher Uncertainty for Correlation 
± 47%                               | ± 9% 


The observed 68% CL uncertainty in the correlation from the data is an 
average of ± 47% (+40% up and -53% down from the average correlation). The 
observed maximum (82%) and minimum (-59%) are about ± 1.7 SD from the 
average (91% CL). 

The observed uncertainty in this correlation from the data cannot be 
explained by the 68% CL Fisher windowing uncertainty of ± 9%, obtained using 
$\sigma_r^{\text{Fisher}} = (1 - r^2)/\sqrt{N - 3}$, with the average correlation used for r. At the 
91% CL the Fisher uncertainties predict that the correlation should lie in the 
restricted range (12%, 41%). 

The bottom line is that the Fisher uncertainty cannot account for the observed 
variability in these correlation data. 


The Importance of Intrinsic Correlation Uncertainties 


We believe that the observed uncertainty in this example clearly indicates the 
existence of intrinsic dynamically generated correlation instabilities. We have 
looked at other examples. Some instabilities seem large, as in the Ag, DM 
example. Other instabilities are smaller. A consequence of intrinsic instability is that no 
“true” correlation exists, regardless of the size of the windows and the length of 
time we wait to measure the correlation. 

We believe that intrinsic instability of correlations should become an 
important modeling issue in the future for financial risk. In Ch. 25 we looked at 
some models for dynamical variations in correlations. In Ch. 23, 24 we discussed 
the stressed correlation matrices, used in the Stressed VAR. 


NEW: Noise-Cleaned Correlations via SSA 


Singular Spectrum Analysis (SSA) can be used to generate noise-cleaned 
correlations with better time-stability, for use in counterparty simulation". The 
idea is to use SSA to smooth out time series, thus eliminating undesirable short- 
time-scale noise and retaining information on relatively long time scales, relevant 
for counterparty risk simulations over long times. 


5 Acknowledgments: I thank Xipei Yang for collaboration on the material added to this 
chapter for the 2nd edition. I also thank Adam Litke, Mario Bondioli, Harvey Stein, Nora 
Omarova, Yan Zhang, Suyan Liu, and Stan Maydan for helpful conversations on SSA 
applied to calculating correlations for counterparty risk. I especially thank Mario Bondioli 
for discussions and calculations on the applications of Fourier Transform filtering 
techniques for reducing noise in correlations. 
In the derivation of the parameters used in SSA, Fourier transforms can be 
used to isolate desirable low-pass filter properties for noise-cleaned correlations. 


Random Matrix Theory Benchmarks for Noise in Correlations 


Random matrix theory" or RMT can be used to construct metrics to measure 
noise and evaluate the efficacy of SSA smoothing. The basic idea is to test 
whether SSA-based correlations are further from noise than are standard 
correlations. A number of novel RMT tests were performed. 
The RMT benchmark consists of correlations from Gaussian random 
numbers in time series, a zero-correlation Wishart random matrix WRM (Ch. 38). 
1. The means and standard deviations for the probability distributions of all 
eigenvalues of a zero-correlation WRM were approximately determined, 
analytically. This seems to be an advance over known mathematical 
results”. The widely used Marchenko-Pastur (MP) distribution’ for 
WRM eigenvalues only holds in an unphysical limit (infinite number n 
of data points and number p of time series: $n, p \to \infty$ with $Q = n/p$ 
fixed). For reference, the MP distribution is $F^{MP}(\lambda)$ with 

$$F^{MP}(\lambda) = \frac{Q}{2\pi}\,\frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda}, \qquad \lambda_{\max,\min} = \left(1 \pm 1/\sqrt{Q}\right)^2 \tag{37.3}$$


2. The signal-to-noise ratio SNR, defined using the crossover between the 
empirical eigenvalue spectrum and the WRM eigenvalue spectrum’, was 
shown to be higher for correlations between SSA-cleaned time series 
than for standard correlations, $\text{SNR}(\rho_{\text{SSA}}) > \text{SNR}(\rho_{\text{Std}})$. 

This shows that the SSA procedure gives correlations that have enhanced 
signal with noise reduction as measured by the SNR. 


3. Signal and Noise Consistency Test: Signal $\rho_{\text{Signal}}$ and noise $\rho_{\text{Noise}}$ 
correlation matrices were extracted using the signal and noise regions of 


* Leading and Non-Leading WRM Eigenvalue Distributions: The Tracy-Widom 
distribution (type 2) applies to the probability distribution of the leading WRM 
eigenvalue (ref) I was unable to find results in the literature characterizing the 
distributions of non-leading WRM eigenvalues. Approximate results were obtained by 
Dash and Yang for the distributions of all WRM eigenvalues, described below. 


7 Signal and Noise Definitions: The eigenvalue crossover definition for physical n, p is 
more advanced than the common definition $\lambda > \lambda_{\max}$ to define signal and $\lambda < \lambda_{\max}$ to 
define noise, which requires the $n, p \to \infty$ limit. 
the eigenvalue spectrum of $\rho$, and then reconstructing the full 
correlation matrix dimension using the SVD technique described in Ch. 
24. Then the matrices $\rho_{\text{Signal}}$ and $\rho_{\text{Noise}}$ were shown indeed to have 
statistical properties of signal and noise; this is the consistency test. 
4. Fundamental RMT objects were used in tests as metrics. The idea is to 
calculate an RMT object $O^{(RMT)}$ three times: (1): $O^{(RMT)}_{\text{Noise}}$ using pure WRM 
noise, (2): $O^{(RMT)}_{\text{SSA}}$ for SSA-based correlations, and (3): $O^{(RMT)}_{\text{Std}}$ using 
standard correlations. We then calculate the distance of SSA-based 
correlations from noise defined as the L2 norm $\left\| O^{(RMT)}_{\text{SSA}} - O^{(RMT)}_{\text{Noise}} \right\|$, along 
with the distance of standard correlations from noise, $\left\| O^{(RMT)}_{\text{Std}} - O^{(RMT)}_{\text{Noise}} \right\|$. 


All tests resulted in SSA-based correlations further from noise than 
standard correlations. The tests performed were (see refs. for definitions): 
a. The KL Entropy Divergence (a common measure) 
b. Paths in the complex plane for the Stieltjes transform. The 
Stieltjes transforms of the distribution of eigenvalues and the 
distributions of eigenvector components were tested. 
c. The bivariate polynomial 
d. The general measure orthonormal polynomials (cf. Ch. 36) 
e. The first free cumulant using the R transform 
f. The eigenvalue spacing distribution 


NEW: Approximate Analytic Probability Distribution for ANY 
Eigenvalue in the Zero-Correlation Wishart Matrix 

Next we give a new analytic approximation for the probability distribution of any 
eigenvalue of a zero-correlation Wishart matrix for arbitrary n, p. The 
approximation we use is Gaussian; we need the mean $\Lambda_k$ and width $\sigma_k$ of the 
Gaussian $N(\Lambda_k, \sigma_k)$ we propose as the analytic approximation to the $k^{th}$ 
eigenvalue probability distribution. 
The idea is to modify the MP distribution Eq. (37.3), originally valid in the 
$n, p \to \infty$ limit, to results that are valid without taking this limit, for finite n, p. 

The MP probability density $F^{MP}(\lambda)$ is used as follows: 
Define “bins” $(b_k, b_{k-1})$ by numbers $\{b_k\}$, $k = 1 \ldots p$, with $b_{k-1} > b_k$ and 
$b_0 = \lambda_{\max}$, $b_p = \lambda_{\min}$. The bins are determined by slicing up the MP probability 
into p equal amounts, viz: 

$$\int_{b_k}^{b_{k-1}} F^{MP}(\lambda)\,d\lambda = \frac{1}{p} \tag{37.4}$$

Now define numbers $\{\lambda_k\}$ at the probability centers of the bins 

$$\int_{b_k}^{\lambda_k} F^{MP}(\lambda)\,d\lambda = \int_{\lambda_k}^{b_{k-1}} F^{MP}(\lambda)\,d\lambda \tag{37.5}$$

Finally we define the approximation $\Lambda_k$ for the mean of the distribution for 
the $k^{th}$ eigenvalue as 

$$\Lambda_k = \frac{\lambda_k}{\sum_{k'=1}^{p} \lambda_{k'}}\; p \tag{37.6}$$

This is designed to get the means to add up to the right value, $\sum_{k=1}^{p} \Lambda_k = p$. 
Note that by definition, using Eq. (37.5), $\Lambda_1$ is the mean of the distribution 
for the largest eigenvalue. 
The standard deviation $\sigma_k$ of the $k^{th}$ distribution is defined by making the 
area for ±1 standard deviation the usual percentage 68.3% of the total Gaussian 
area, the percentage taken of the width of the bin", viz 

$$2\sigma_k = (b_{k-1} - b_k) \times 68.3\% \tag{37.7}$$

The analytic approximation to the probability distribution of the $k^{th}$ 
eigenvalue of the zero-correlation Wishart matrix is $N(\Lambda_k, \sigma_k)$. 
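The construction in Eqs. (37.4)-(37.7) can be sketched numerically as follows. This is an illustrative sketch, not the authors' code: it assumes n > p (so the MP density has no point mass at zero), uses a simple grid-based numerical CDF, labels k = 1 as the largest eigenvalue, and the function names are hypothetical. A small simulation helper is included for the comparison suggested in the homework footnote.

```python
import numpy as np

def mp_density(lam, q):
    """Marchenko-Pastur density for unit variance and Q = n/p > 1."""
    lam_min = (1.0 - 1.0 / np.sqrt(q)) ** 2
    lam_max = (1.0 + 1.0 / np.sqrt(q)) ** 2
    inside = np.clip((lam_max - lam) * (lam - lam_min), 0.0, None)
    return q / (2.0 * np.pi) * np.sqrt(inside) / lam

def gaussian_eigenvalue_approx(n, p, grid_size=200_001):
    """Means and widths of the Gaussian approximation for each eigenvalue.

    Index 0 of the returned arrays corresponds to the largest eigenvalue.
    """
    q = n / p
    lam_min = (1.0 - 1.0 / np.sqrt(q)) ** 2
    lam_max = (1.0 + 1.0 / np.sqrt(q)) ** 2
    grid = np.linspace(lam_min, lam_max, grid_size)
    dens = mp_density(grid, q)
    # Trapezoid-rule cumulative distribution, renormalized to end at 1
    steps = 0.5 * (dens[1:] + dens[:-1]) * np.diff(grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    # Bin edges hold probability 1/p each; centers split each bin in half
    edges_up = np.interp(np.arange(p + 1) / p, cdf, grid)
    centers_up = np.interp((np.arange(p) + 0.5) / p, cdf, grid)
    b = edges_up[::-1]                 # b[0] = lam_max, ..., b[p] = lam_min
    lam_k = centers_up[::-1]
    means = p * lam_k / lam_k.sum()    # rescaled so the means sum to p
    sigmas = 0.683 * (b[:-1] - b[1:]) / 2.0
    return means, sigmas

def simulate_wrm_eigenvalues(n, p, n_trials=200, seed=0):
    """Sorted (descending) eigenvalues of zero-correlation sample matrices."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_trials, p))
    for t in range(n_trials):
        x = rng.standard_normal((n, p))
        corr = np.corrcoef(x, rowvar=False)
        out[t] = np.linalg.eigvalsh(corr)[::-1]
    return out
```

Comparing `simulate_wrm_eigenvalues(n, p).mean(axis=0)` with the returned means gives a quick check of how well the approximation works for a given n, p.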


Numerical Accuracy 
The above analytic results are reasonably accurate numerically’. 


* Homework: Generate some time series with Gaussian random numbers and obtain their 
correlations. Then get the eigenvalue spectrum of the resulting correlation matrix (which 
is itself random). Redo the calculation a number of times and plot the distribution you get 
A Few Other Aspects of Data and Correlations 


The Importance of Conventions 


You should keep track of the conventions and be able to defend them. As we 
have emphasized, time windows of different length (e.g. 3 months or 1 year) can 
produce different results, and are relevant under different circumstances. Even 
minor conventions for business days (bd) are important, e.g. 260 bd/yr or 250 
bd/yr (2 weeks of holidays). 

There are potentially dangerous sociological situations that can be associated 
with conventions. Non-technical people can latch onto small differences in 
calculations and assign illogical importance to these differences. In particular, 
time can be wasted ferreting out the reasons for small differences in badly 
documented alternative calculations that depend on different conventions. 

On the other hand, sometimes such differences are large and warrant scrutiny. 


Data Problems and Positive-Definite Correlations 


Any “garbage” in the data will produce a breakdown of the correlation matrix 
away from its theoretical positive definite attribute. Dealing with non-positive 
definite matrices requires the use of the singular value decomposition procedure, 
as described in Ch. 24. 


Cleaning Up the Data 


“Cleaning up the data” can be a big issue, which can involve interesting technical 
and sociological ramifications far beyond the scope of this book. 
for the values of each eigenvalue (e.g. the 5th eigenvalue). Compare to the analytic results 
above. What do you think? Pretty cool, yes? 
Wikipedia: http://en.wikipedia.org/wiki/Marchenko%E2%80%93Pastur_distribution 


" Gaussian (Normal) Distribution 
Wikipedia (see figure of Gaussian): http://en.wikipedia.org/wiki/Normal_distribution 
38. Wishart’s Theorem and Fisher’s Transform 
(Tech. Index 9/10) 


In this chapter, we consider the mathematics and theory for three topics. They are 
(1): the Wishart theorem, (2): the Fisher Transform, and (3): Implications for 
correlation uncertainties due to sampling error’. John Wishart, in a brilliant 
exposition’, generalized earlier work" to obtain the distribution of standard 
deviations and correlations obtained from a sample of N measurements of p 
variables, assuming that all variables obey a multivariate Gaussian distribution. 

For the case p = 2, we will discuss theoretically the correlation uncertainties 
due to sampling error using finite windows, and give a simple derivation of the 
results using the Fisher transformation. We will also discuss the Wishart theorem 
using Fourier transforms that we believe gives some insight’. 

Of course, statistical uncertainties exist. It is important to have a handle on 
the statistical sampling uncertainties. Still, other non-statistical uncertainties are 
probably bigger than statistical uncertainties, especially in stressed markets. 

Although the Wishart distribution is well defined only if the number of 
measurements exceeds the number of variables, i.e. N > p, risk management at 
the corporate level often assumes multivariate Gaussian behavior for many 
thousands of variables but yet generally works with only a few years of data at 
most, so in practice N « p is unavoidable. As was mentioned in the previous 
chapter, the use of N « p merely implies numerical degeneracies of correlations 
reducing the dimension p down to some effective dimension $p_{\text{eff}} < p$ 
satisfying $N > p_{\text{eff}}$. We presented some numerical examples that indicate that 
these degeneracies do not significantly impact risk analysis’. The significant 
uncertainties (due both to intrinsic instabilities and windowing noise) 


' Acknowledgement: I thank Ardavan Nozari for pointing out the Wishart theorem, and 
for informative conversations. 


? History: The work described in this section was mostly done in 1999, with additional 
work on the Wishart distribution in 2002-03. 


? Corporate VAR: The statement that N < p does not really cause problems is good news 
for corporate risk managers who need to quote a risk measure like VAR, and who live in 
a world where the number of data points available is less than the number of variables 
needed. Otherwise, the Wishart theorem would rule out their jobs. 
overshadow the smaller uncertainties due to the imposition of partial correlation 
degeneracies due to N « p. 


Warm Up: The Distribution for a Volatility Estimate 


In this section, at the risk of boring the reader, we derive the well-known results 
for the distribution of a volatility estimate, mostly in order to introduce the 
Fourier transform formalism that is used below to discuss the Wishart theorem. The 
techniques are quite general". 

We consider a variable $x_\ell$ (with $\ell = 1 \ldots N$ being a time index). By $x_\ell$ we 
really have in mind time differences or returns, but to simplify notation we do not 
indicate the time difference operator $d_t$. We simplify notation further by taking 
the probability distribution of each $x_\ell$ to be Gaussian with unit variance and zero 
mean, so the integrated probability distribution is 

$$P_N^{(\text{Integrated})} = \int dP_N = \prod_{\ell=1}^{N} \int_{-\infty}^{\infty} \frac{dx_\ell}{\sqrt{2\pi}}\, \exp\left(-x_\ell^2/2\right) \tag{38.1}$$

Now $P_N^{(\text{Integrated})}$ is exactly 1, but we will soon restrict the integration region. The 
estimated variance is $s^2 = \sum_{\ell=1}^{N} x_\ell^2$. This formula contains a factor $N - 1$ in the 
definition to avoid cluttering up the page (at the end of this section we will 
redefine the variance in the more conventional notation). We want the 
distribution $P_N(s^2)$ with $P_N^{(\text{Integrated})} = \int P_N(s^2)\,ds^2$. To this end, we insert 
the factor 1 in the above integral and substitute for it the right-hand side of the 
Dirac delta function identity 

$$1 = \int_0^\infty ds^2\; \delta\left(s^2 - \sum_{\ell=1}^{N} x_\ell^2\right) \tag{38.2}$$
0 


We then use the Fourier transform representation of the delta function 


* Generality: For example, the F-distribution can be derived this way, etc. 


? Dirac Delta Function Trick and G. Parisi: By now the reader should be familiar with 
this trick, used many times in this book. The idea comes from a paper (long lost), written 
by the physicist Giorgio Parisi around 1975. 
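$$\delta\left(s^2 - \sum_{\ell=1}^{N} x_\ell^2\right) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \exp\left[i\omega\left(s^2 - \sum_{\ell=1}^{N} x_\ell^2\right)\right] \tag{38.3}$$

We remove the integral $\int_0^\infty ds^2$ to get 

$$P_N(s^2) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{i\omega s^2} \prod_{\ell=1}^{N} \int_{-\infty}^{\infty} \frac{dx_\ell}{\sqrt{2\pi}}\, \exp\left[-(i\omega + 1/2)\,x_\ell^2\right] \tag{38.4}$$

We can do each Gaussian integral to get 

$$P_N(s^2) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{e^{i\omega s^2}}{2^{N/2}\,(i\omega + 1/2)^{N/2}} \tag{38.5}$$

Now we have $s^2 > 0$ so we can do the $\omega$ integral using complex variable 
integration. We close the contour in the upper half (UH) $\omega$ complex plane with 
$\mathrm{Im}\,\omega > 0$. The integral vanishes for the semi-circle at infinity due to the factor 
$\exp(-s^2\,\mathrm{Im}\,\omega) \to 0$. If $N = 2M$ is even, there is a multiple pole at $\omega = i/2$ on the 
imaginary $\omega$ axis. 
To proceed we introduce an auxiliary parameter $\zeta$ and write 

$$P_N(s^2) = \frac{(-1)^{M-1}}{2^M (M-1)!}\, \frac{\partial^{M-1}}{\partial\zeta^{M-1}} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{e^{i\omega s^2}}{i\omega + \zeta + 1/2}\, \Bigg|_{\zeta = 0} \tag{38.6}$$

Here, the parameter $\zeta$ is kept as $\zeta > -1/2$, and then is set to zero after the 
derivatives are performed. 

This integral has a single pole at $\omega = i(\zeta + 1/2)$ in the UH $\omega$ plane, 
producing $\exp\left[-(\zeta + 1/2)s^2\right]$. Performing the derivatives, using the gamma 
function" definition $\Gamma(M) = (M - 1)!$ and replacing $M = N/2$, we get the 
usual result 

$$P_N(s^2) = \frac{(s^2)^{N/2 - 1}\, \exp(-s^2/2)}{2^{N/2}\,\Gamma(N/2)} \tag{38.7}$$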
Although derived for N even, we can analytically continue the result in Eqn. 
(38.7) to N odd. We note $\Gamma(z+1) = z\,\Gamma(z)$ and $\Gamma(1/2) = \sqrt{\pi}$. 

Then $\mathcal{P}_N\left[s^2 \le x^2\right] = \int_0^{x^2} P_N(s^2)\,ds^2$ is the probability that the 
variance estimate is less than or equal to $x^2$. 

We now eliminate the simplifications made above. We redefine the Gaussian 
probability distribution to have a mean $\mu$ and volatility $\sigma$. We need to include 
the average $\bar{x} = \frac{1}{N}\sum_{\ell=1}^{N} x_\ell$. We do this by inserting the additional constraint 

$$1 = \int_{-\infty}^{\infty} d\bar{x} \int_{-\infty}^{\infty} \frac{d\bar{\omega}}{2\pi}\, \exp\left[i\bar{\omega}\left(\bar{x} - \frac{1}{N}\sum_{\ell=1}^{N} x_\ell\right)\right] \tag{38.8}$$

We also set $s^2 = \frac{1}{N-1}\sum_{\ell=1}^{N} (x_\ell - \bar{x})^2$, restoring the usual definition. Performing 
the extra two integrals results in an extra factor replacing the order of the 
singularity $N/2 \to (N-1)/2$. Hence, with all factors, setting $\lambda = 1/(N-1)$, 
we now get 

$$P_N(s^2)\,ds^2 = \frac{ds^2}{2\sigma^2\lambda\;\Gamma\left((N-1)/2\right)} \left(\frac{s^2}{2\sigma^2\lambda}\right)^{(N-3)/2} \exp\left(-\frac{s^2}{2\sigma^2\lambda}\right) \tag{38.9}$$

This produces $\langle s^2 \rangle = \sigma^2$. Note that we need $N \ge 2$ in order that the 
singularity in $s^2$ is integrable as $s^2 \to 0$. 
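As a quick sanity check, the density for the variance estimate with mean removed, i.e. a scaled chi-squared density with N - 1 degrees of freedom and $\lambda = 1/(N-1)$, can be evaluated numerically to confirm that it is normalized and has mean $\sigma^2$. The function name `variance_pdf` and the grid choices below are assumptions made for this sketch.

```python
import math
import numpy as np

def variance_pdf(s2, n_obs, sigma):
    """Density of the variance estimate s^2 for N Gaussian observations.

    P(s^2) = z^((N-3)/2) * exp(-z) / (scale * Gamma((N-1)/2)),
    with scale = 2 * sigma^2 * lambda, lambda = 1/(N-1), z = s^2 / scale.
    Requires s2 > 0.
    """
    lam = 1.0 / (n_obs - 1)
    scale = 2.0 * sigma**2 * lam
    z = np.asarray(s2, dtype=float) / scale
    log_pdf = (0.5 * (n_obs - 3) * np.log(z) - z
               - math.lgamma(0.5 * (n_obs - 1)) - math.log(scale))
    return np.exp(log_pdf)

def trapezoid(y, x):
    """Plain trapezoid rule, written out to avoid version-specific helpers."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

if __name__ == "__main__":
    grid = np.linspace(1e-9, 0.4, 200_001)
    pdf = variance_pdf(grid, n_obs=10, sigma=0.2)
    print("normalization:", trapezoid(pdf, grid))
    print("mean of s^2  :", trapezoid(grid * pdf, grid), "vs sigma^2 =", 0.2**2)
```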


The Wishart Distribution 


Consider a p-dimensional world with variables $\{x_{\alpha\ell}\}$, where $\alpha = 1 \ldots p$ is the 
label for internal degree of freedom (e.g. interest rates, commodities, etc.) and 
$\ell = 1 \ldots N$ the time index. Again, we have in mind applying the formula to time 
differences or returns, but again for simplicity, we do not indicate the $d_t$ 
notation. Define the quadratic form $V_{\alpha\beta}$ by? 


$ Comment on the V matrix: Note that there is no correlation matrix in the definition. 
$$V_{\alpha\beta} = \sum_{\ell=1}^{N} x_{\alpha\ell}\, x_{\beta\ell} \tag{38.10}$$

As above, we first simplify the arithmetic by assuming that the $\{x_{\alpha\ell}\}$ variables 
are distributed according to a multivariate Gaussian distribution with unit 
volatilities, ignore the average constraints, and denote the given correlation 
matrix by $(\rho_{\alpha\beta})$. The Wishart distribution for the probability that $V_{\alpha\beta}$ lies in the 
volume $\prod_{\alpha \le \beta} dV_{\alpha\beta}$ is” 

$$P_N(V) = \frac{(\det V)^{(N-p-1)/2}\, \exp\left[-\mathrm{Tr}\left(V\rho^{-1}\right)/2\right]}{K\,(\det\rho)^{N/2}} \tag{38.11}$$

Here the “kinematic factor" $K = 2^{Np/2}\,\pi^{p(p-1)/4} \prod_{a=1}^{p} \Gamma\left[(N - a + 1)/2\right]$ is a 
constant depending on N, p and not on the dynamical matrices $V$, $\rho$. 

Note that if p = 1 we have $\det V = s^2$ and $\rho = 1$ so we just reproduce the 
distribution we had for the estimated volatility above. To keep the distribution 
finite for general p as $\det V \to 0$ we need the restriction N > p. 

If we impose the average constraints, we replace $N \to N - 1$. The unit 
volatility assumptions are relaxed through scaling, and the normalization with 
$\lambda = 1/(N - 1)$ can be included explicitly as above. 
With this normalization, the expected value of $V_{\alpha\beta}$ is just $\rho_{\alpha\beta}$. Including volatilities $\{\sigma_\alpha\}$, the 
expected value of $V_{\alpha\beta}$ is $\sigma_\alpha \rho_{\alpha\beta} \sigma_\beta$. The uncertainty of $V_{\alpha\beta}$ around its expected 
value is the subject of the Wishart distribution. 
We discuss the form of the Wishart distribution below. 
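To make the structure of the density concrete, here is a minimal sketch of the log of the standard Wishart density for unit volatilities and no average constraint, with the kinematic factor $K = 2^{Np/2}\pi^{p(p-1)/4}\prod_a \Gamma[(N-a+1)/2]$. The function name `wishart_logpdf` is hypothetical. For p = 1 it reduces to the chi-squared log-density of the warm-up section, which the small check below confirms.

```python
import math
import numpy as np

def wishart_logpdf(v, rho, n_obs):
    """Log of the Wishart density for quadratic form V and correlation rho.

    log P = (N-p-1)/2 * log det V - Tr(V rho^-1)/2 - log K - N/2 * log det rho
    """
    v = np.atleast_2d(np.asarray(v, dtype=float))
    rho = np.atleast_2d(np.asarray(rho, dtype=float))
    p = v.shape[0]
    # Kinematic factor K depends only on N and p
    log_k = (0.5 * n_obs * p * math.log(2.0)
             + 0.25 * p * (p - 1) * math.log(math.pi)
             + sum(math.lgamma(0.5 * (n_obs - a + 1)) for a in range(1, p + 1)))
    _, logdet_v = np.linalg.slogdet(v)
    _, logdet_rho = np.linalg.slogdet(rho)
    trace_term = np.trace(v @ np.linalg.inv(rho))
    return (0.5 * (n_obs - p - 1) * logdet_v - 0.5 * trace_term
            - log_k - 0.5 * n_obs * logdet_rho)

def chi2_logpdf(s2, n_obs):
    """Chi-squared log-density with N degrees of freedom (the p = 1 case)."""
    return ((0.5 * n_obs - 1.0) * math.log(s2) - 0.5 * s2
            - 0.5 * n_obs * math.log(2.0) - math.lgamma(0.5 * n_obs))

if __name__ == "__main__":
    print(wishart_logpdf([[3.0]], [[1.0]], 8), chi2_logpdf(3.0, 8))
```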


7 Notation: Please do not confuse the Latin letter p used for the internal dimension with 
the Greek letter $\rho$ used for the correlation matrix. 
The Probability Function for One Estimated Correlation 


Before launching into a general discussion, we consider the case p = 2 with one 
off-diagonal correlation element $\rho_{12}$ in the original 2-dimensional Gaussian 
probability distribution for each $x_{1\ell}, x_{2\ell}$. We (hopefully not too confusingly) 
denote $\rho_{12}$ by $\rho$, so the determinant of the correlation matrix in this 
notation is $\det\rho = 1 - \rho^2$. We want to integrate over the volatility degrees of 
freedom to find the distribution for the measured sample correlation r relative to 
the given correlation $\rho$ in the probability distribution. Here r is given as 

$$r = \frac{\sum_{\ell=1}^{N} (x_{1\ell} - \bar{x}_1)(x_{2\ell} - \bar{x}_2)}{s_1 s_2\,(N - 1)} \tag{38.12}$$

We replace $N \to N - 1$ in the p = 2 Wishart distribution to include the $\bar{x}_1, \bar{x}_2$ 
averages. The resulting pdf is originally a result of Fisher ". We integrate over 
$ds_1^2\,ds_2^2$ by changing to elliptic co-ordinates $s_1 = \sigma_1 \xi \cos\psi$, $s_2 = \sigma_2 \xi \sin\psi$ 
with $\psi \in [0, \pi/2]$. All dependence on the volatilities eventually cancels, so we 
can take unit vols. During the algebra, we encounter the $\Gamma$ function 
$\int_0^\infty e^{-y}\, y^{N-2}\, dy = \Gamma(N-1)$ where $y = N\xi^2 (1 - r\rho \sin 2\psi)/(2\det\rho)$. We get 
the integral $I = \int_0^{\pi/2} d\psi\, (\sin 2\psi)^{N-2} / (1 - r\rho \sin 2\psi)^{N-1}$. This integral can be expanded in a 
series in $r\rho$ and integrated term-by-term using the Euler beta function ". We 
obtain the result for the probability distribution $\wp(r)$ of the correlation r with 
N measurements, given a “true” value for the correlation $\rho$, as 

$$\wp(r) = \frac{2^{N-3}\,(1 - \rho^2)^{(N-1)/2}\,(1 - r^2)^{(N-4)/2}}{\pi\,\Gamma(N-2)} \sum_{k=0}^{\infty} \frac{\Gamma^2\left[(N + k - 1)/2\right]\,(2r\rho)^k}{k!} \tag{38.13}$$
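The series for the sample-correlation density can be evaluated numerically. The sketch below uses the standard textbook form of the exact density, prefactor $2^{N-3}(1-\rho^2)^{(N-1)/2}(1-r^2)^{(N-4)/2}/[\pi\,\Gamma(N-2)]$ multiplying $\sum_k \Gamma^2[(N+k-1)/2](2r\rho)^k/k!$; the truncation `k_max`, the log-space rescaling, and the function name are assumptions of this sketch. For strongly negative $r\rho$ and large N the alternating sum loses precision, so moderate N is assumed here.

```python
import math
import numpy as np
from numpy.polynomial import polynomial as npoly

def correlation_pdf(r, rho, n_obs, k_max=400):
    """Exact density of the sample correlation r via the truncated series."""
    r = np.asarray(r, dtype=float)
    # Series coefficients Gamma^2((N+k-1)/2) / k!, computed in log space
    log_coef = np.array([2.0 * math.lgamma(0.5 * (n_obs + k - 1))
                         - math.lgamma(k + 1.0) for k in range(k_max)])
    shift = log_coef.max()                      # rescale to avoid overflow
    coef = np.exp(log_coef - shift)
    series = npoly.polyval(2.0 * r * rho, coef)  # sum_k coef_k * (2 r rho)^k
    log_pref = ((n_obs - 3) * math.log(2.0) - math.log(math.pi)
                - math.lgamma(n_obs - 2)
                + 0.5 * (n_obs - 1) * math.log(1.0 - rho**2)
                + 0.5 * (n_obs - 4) * np.log1p(-r**2))
    return np.exp(log_pref + shift) * series

if __name__ == "__main__":
    grid = np.linspace(-0.9999, 0.9999, 20_001)
    pdf = correlation_pdf(grid, rho=0.53, n_obs=20)
    w = np.diff(grid)
    total = np.sum(0.5 * (pdf[1:] + pdf[:-1]) * w)
    print("normalization:", total)
```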


* Equivalent Expression: Another expression for the correlation pdf exists with a 
hypergeometric function, which is numerically identical with Eq. (38.13). I thank Mario 
Bondioli for demonstrating this equivalence. 
Fisher’s Transform and the Correlation Probability Function 


The above sum is, to say the least, unenlightening. Progress can be made using 
the Fisher transformation 

$$R_{\text{Fisher}} = \frac{1}{2}\ln\left[\frac{1 + r}{1 - r}\right] \tag{38.14}$$

The inverse is $r = \tanh R_{\text{Fisher}}$. As $r \to \pm 1$ we have $R_{\text{Fisher}} \to \pm\infty$. We also 
write $\rho = \tanh R_\rho$. 
As the reader may surmise, we are after a Gaussian approximation to $\wp(r)$ 
as a function of $R_{\text{Fisher}}$. To this end, we assume that N is large and use various 
asymptotic formulae. We also need the WKB approximation, which we discuss 
next. 


WKB Approximation and Fisher’s Gaussian Approximation 
Notice that, apart from the squared $\Gamma$ function, the sum Eqn. (38.13) for $\wp(r)$ 
looks like the exponential sum. As a warm-up, we note that the WKB method " 
can be used to obtain the dominant region in the exponential sum, 
$\exp v = \sum_{k=0}^{\infty} v^k / k!$. We use Stirling’s asymptotic formula " $k! \approx k^k e^{-k} \sqrt{2\pi k}$, 
along with an integral approximation to the sum. This produces 
$\sum_{k=0}^{\infty} v^k / k! \approx \int dk\, (2\pi k)^{-1/2} \exp\left[\Phi(k)\right]$, where the function in the exponential is 
$\Phi(k) = -k\ln k + k + k\ln v$. The WKB method involves expanding $\Phi(k)$ to 
quadratic order around the stationary point $k_0$, where $\Phi'(k_0) = 0$. We find 
$k_0 = v$ as the stationary point. Carrying out the Gaussian integral, we obtain 
$\exp v = \sum_{k=0}^{\infty} v^k / k!$ consistently. 
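For readers who want the intermediate steps of this warm-up, the stationary-point expansion can be written out as follows. This is a sketch under the usual large-$v$ assumptions: the slowly varying prefactor $(2\pi k)^{-1/2}$ is evaluated at $k_0$, and the lower limit of the $k$ integral is extended to $-\infty$.

```latex
\begin{align*}
\Phi(k) &= -k\ln k + k + k\ln v, \\
\Phi'(k) &= \ln(v/k) = 0 \;\Rightarrow\; k_0 = v, \\
\Phi(k_0) &= v, \qquad \Phi''(k_0) = -\frac{1}{v}, \\
\sum_{k=0}^{\infty} \frac{v^k}{k!}
  &\approx \frac{e^{v}}{\sqrt{2\pi v}} \int_{-\infty}^{\infty} dk\,
     \exp\left[-\frac{(k - v)^2}{2v}\right]
   = \frac{e^{v}}{\sqrt{2\pi v}}\,\sqrt{2\pi v} = e^{v}.
\end{align*}
```

The dominant terms of the sum therefore lie within a width of order $\sqrt{v}$ around $k = v$, which is why $k$ can be replaced by $v$ in slowly varying factors.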

We return to the evaluation of $\wp(r)$. The asymptotic formula 
$\Gamma(z + a) \approx z^a\,\Gamma(z)\left[1 + a(a-1)/(2z)\right]$ is used, with $z = (N-3)/2$ and with 
appropriate values of a for the various gamma functions. We then have an 
approximation to the sum for $\wp(r)$ as an exponential sum. Here, 
$v = (N - 3)\tanh R_{\text{Fisher}} \tanh R_\rho$. Using the WKB result, we replace $k \to v$ for 
non-dominant terms at large N. We use Taylor expansions to first order, 
$\ln(\cosh^2 R) \approx R^2$ and $\tanh R \approx R$ in order to get the leading quadratic terms in 
the exponent. We then obtain Fisher's approximate Gaussian result 

$$\wp(R_{\text{Fisher}}) \approx \frac{1}{\sqrt{2\pi\sigma_R^2}}\, \exp\left[-\frac{(R_{\text{Fisher}} - R_\rho)^2}{2\sigma_R^2}\right] \tag{38.15}$$

Here, $\sigma_R^2 = 1/(N - 3)$. 


Fisher’s Volatility: Statistical Noise of Estimated Correlation 


Fisher’s result is an approximate Gaussian behavior in the variable $R_{\text{Fisher}}$ with 
an expected value of $R_\rho = \frac{1}{2}\ln\left[(1 + \rho)/(1 - \rho)\right]$ and an approximate width of 
$\sigma_R = 1/\sqrt{N - 3}$. This result can be used conveniently to set confidence levels 
for correlations in measuring the windowing noise. We can re-express the Fisher 
width for the estimated correlation r at one SD uncertainty using 

$$\left(\sigma_r^{\text{Fisher}}\right)^2 = \left\langle (r - \rho)^2 \right\rangle \approx \left(\frac{\partial r}{\partial R_{\text{Fisher}}}\right)^2 \left\langle (R_{\text{Fisher}} - R_\rho)^2 \right\rangle = \frac{(1 - r^2)^2}{N - 3} \tag{38.16}$$

The quantity $\sigma_r^{\text{Fisher}}$ was used in the previous chapter in the numerical correlation 
uncertainty study. We notice that the $(1 - r^2)$ factor makes $\sigma_r^{\text{Fisher}}$ vanish at 
$r = \pm 1$. Naturally as $N \to \infty$, the statistical uncertainty is zero and the value $r = \rho$ 
is recovered. We get the probability explicitly in terms of the usual correlation by 
changing variables; after some algebra we get the approximate Gaussian form 
directly in the correlation variable: 

$$\wp(r) \approx \frac{1}{\sqrt{2\pi\left(\sigma_r^{\text{Fisher}}\right)^2}}\, \exp\left[-\frac{(r - \rho)^2}{2\left(\sigma_r^{\text{Fisher}}\right)^2}\right] \tag{38.17}$$

The approximation Eq. (38.17) holds for correlations not too close to the 
boundaries ±1 and reasonably large N. In practice, the equation holds quite 
close to the boundaries and for N not very large. 
We re-emphasize that this entire framework assumes that there exists in fact 
one given value for the correlation. Perversely, in the real world, correlations 
have important non-stationary intrinsic instabilities that have nothing to do with 
the statistical uncertainties described by the above results. 


NEW: Hedge Fund Style-Change Risk and Correlation Changes 


Monitoring the activities of institutions (e.g. hedge funds HFs) that claim to 
follow a certain “style” or strategy can be complicated. First, information on an 
HF investment strategy is fragmentary. Second, this fragmentary information is 
provided only periodically (e.g. monthly). 

The correlation distribution can be used for a style-change risk metric. 


Consider a hedge fund $HF_{ABC}$ supposedly following a strategy S. The idea is to 
look at the time dependence of the correlation $\rho\left[HF_{ABC}, I(S)\right]$ between the returns 
of $HF_{ABC}$ and the returns of an index $I(S)$ comprised of HFs that follow strategy 
S. If this correlation deviates outside a Fisher standard deviation, there is an 
indication that the $HF_{ABC}$ strategy changed; this is called a "style-change". 


Difficulties in practice include substantial noise due to limited statistics. 
Also, the HF may actually be employing a variety of strategies. 
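A style-change monitor along these lines might be sketched as follows. Everything specific here is an assumption for illustration: the reference correlation `rho_ref` is supplied by the user (the text does not say how the baseline is chosen), the test is done in the Fisher variable with width $1/\sqrt{N-3}$, rolling windows overlap, and the function name is hypothetical.

```python
import numpy as np

def style_change_flags(fund_returns, index_returns, rho_ref, window, n_sd=1.0):
    """Flag rolling windows whose fund/index correlation leaves the Fisher band.

    A window is flagged when |atanh(r_window) - atanh(rho_ref)| exceeds
    n_sd / sqrt(window - 3), i.e. n_sd Fisher standard deviations.
    """
    fund = np.asarray(fund_returns, dtype=float)
    index = np.asarray(index_returns, dtype=float)
    sigma_big_r = 1.0 / np.sqrt(window - 3)
    big_r_ref = np.arctanh(rho_ref)
    n_windows = len(fund) - window + 1
    flags = np.zeros(n_windows, dtype=bool)
    for start in range(n_windows):
        stop = start + window
        r = np.corrcoef(fund[start:stop], index[start:stop])[0, 1]
        flags[start] = abs(np.arctanh(r) - big_r_ref) > n_sd * sigma_big_r
    return flags
```

With monthly data the windows are short, so the Fisher band is wide and a single flagged window is weak evidence; in practice one would probably look for persistent flags, consistent with the caveat about limited statistics.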


Derivation - Fourier Transform of the Wishart Distribution 


We now give some details regarding the derivation of the Wishart distribution. 
Wishart used a geometrical proof. Here, we follow the Fourier transform method, 
generalized from the above discussion of the p =1 case for the chi-squared 


distribution for the estimated volatility. The Fourier generating function of the 
Wishart distribution is straightforward to obtain and agrees with the known 
result. Evaluating the integrals of the Fourier generating function to get the 
Wishart distribution is a difficult exercise in multi-complex-variable contour 
integration, and the proof is not complete. Nonetheless, the origin of all the 
component factors in the Wishart distribution can be seen in a rather 
straightforward fashion. This gives insight into the inner workings of the Wishart 
distribution. 


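Derivation Outline; the Dynamic Components in Wishart's Distribution 
We begin with the assumed multi-normal form for the $\{x_{\alpha\ell}\}$ variables (with unit 
volatilities and assumed zero average value to simplify the notation), 

$$dP_N = \prod_{\ell=1}^{N} \left\{ \frac{1}{(\det\rho)^{1/2}}\, \exp\left[-\frac{1}{2}\sum_{\alpha,\beta=1}^{p} x_{\alpha\ell}\,(\rho^{-1})_{\alpha\beta}\,x_{\beta\ell}\right] \prod_{\alpha=1}^{p} \frac{dx_{\alpha\ell}}{\sqrt{2\pi}} \right\} \tag{38.18}$$

We immediately see one factor in the Wishart distribution, $(\det\rho)^{-N/2}$. 

The symmetry in the internal variable labels $\alpha, \beta$ and the fact that only 
$\alpha \le \beta$ variables are independent plays an important, if annoying, role. We use 
the Dirac delta function trick for the independent $V_{\alpha\beta}$ with $\alpha \le \beta$, 

$$1 = \prod_{\alpha \le \beta} \int_{-\infty}^{\infty} dV_{\alpha\beta}\; \delta\left(V_{\alpha\beta} - \sum_{\ell=1}^{N} x_{\alpha\ell}\,x_{\beta\ell}\right) \tag{38.19}$$

Introducing the variables $\{\omega_{\alpha\beta}\}$ we write the Fourier transform as 

$$1 = \prod_{\alpha \le \beta} \int_{-\infty}^{\infty} dV_{\alpha\beta}\; \eta_{\alpha\beta} \int_{-\infty}^{\infty} \frac{d\omega_{\alpha\beta}}{2\pi}\, \exp\left[i\eta_{\alpha\beta}\,\omega_{\alpha\beta}\left(V_{\alpha\beta} - \sum_{\ell=1}^{N} x_{\alpha\ell}\,x_{\beta\ell}\right)\right] \tag{38.20}$$

Here, $\eta_{\alpha\alpha} = 1$, and $\eta_{\alpha\beta} = 2$ if $\alpha < \beta$. We now rewrite the integrand to get rid 
of $\eta_{\alpha\beta}$ in favor of summing over all $\alpha, \beta$ with the understanding that 

$$\prod_{\alpha \le \beta} \exp\left[i\eta_{\alpha\beta}\,\omega_{\alpha\beta}\left(V_{\alpha\beta} - \sum_{\ell=1}^{N} x_{\alpha\ell}\,x_{\beta\ell}\right)\right] = \prod_{\text{All } \alpha, \beta} \exp\left[i\omega_{\alpha\beta}\left(V_{\alpha\beta} - \sum_{\ell=1}^{N} x_{\alpha\ell}\,x_{\beta\ell}\right)\right] \tag{38.21}$$

The reason for this step is that identities we need only hold for sums over all 
$\alpha, \beta$. Still, the integrations are only over those $\{\omega_{\alpha\beta}\}$ with $\alpha \le \beta$. 
We note that $\prod_{\alpha,\beta} \exp\left(i\omega_{\alpha\beta} V_{\alpha\beta}\right) = \exp\left[i\,\mathrm{Tr}(\omega V)\right]$, which appears on the 
right hand side of this equation. If we replace the matrix $\omega$ with the matrix 
$\omega^{(0)} = i\rho^{-1}/2$, we would get the quantity $\exp\left[-\mathrm{Tr}\left(V\rho^{-1}\right)/2\right]$. This is 
another factor in the Wishart expression. In fact, this replacement is exactly what 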
happens during the integration over the $\{\omega_{\alpha\beta}\}$ variables in the multidimensional 
complex plane. 

The last factor in the Wishart distribution, $(\det V)^{(N-p-1)/2}$, arises from 
derivatives of the quantity $\exp\left[-\mathrm{Tr}(\zeta V)\right]$ with respect to the determinant of 
derivatives of an auxiliary matrix $\zeta$. The auxiliary matrix generalizes the p = 1 
auxiliary variable whose derivatives are needed to turn a simple pole into higher- 
order singularities. 


Result - Wishart Distribution Fourier Transform 


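As just mentioned, we need an auxiliary matrix $\zeta$ with matrix elements $\zeta_{\alpha\beta}$; 
these are eventually set to zero. We multiply the probability distribution by the 
factor $\exp\left[-\sum_{\alpha,\beta=1}^{p} \sum_{\ell=1}^{N} \zeta_{\alpha\beta}\, x_{\alpha\ell}\, x_{\beta\ell}\right]$. The matrix $\rho^{-1} + 2i\omega + 2\zeta$ then appears in 
the quadratic form; we rewrite this as $2i\left(\omega - \omega^{(0)} - i\zeta\right)$ where $\omega^{(0)} = i\rho^{-1}/2$. 

The integral over the $\{x_{\alpha\ell}\}$ can be done immediately and produces the factor 
$\left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-N/2}$. The result for the Fourier transform, i.e. the 
integrand of the $\{\omega_{\alpha\beta}\}$ variables, up to a constant, is then 

$$G_N = (\det\rho)^{-N/2} \prod_{\alpha \le \beta} dV_{\alpha\beta} \cdot \exp\left[i\,\mathrm{Tr}(\omega V)\right] \cdot \left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-N/2} \tag{38.22}$$

This is the Fourier generating function for the Wishart distribution and agrees 
with the literature (see Evans et al.)’. 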


Performing the Multiple Fourier Integrals 
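We now have to do the Fourier integrals. Counting variables, we have 
$p(p+1)/2$ integrals over the $\{\omega_{\alpha\beta}\}$ variables with $\alpha \le \beta$. Just as in the 
p = 1 case, the order of the singularity needs to be reduced to correspond to 
the order of the integration. We want $\left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-(p+1)/2}$ since the 
determinant itself is of order p. We need to differentiate with respect to 
$\det\left(\partial/\partial\zeta_{\alpha\beta}\right)$. A difficulty, discovered by hand calculations and Mathematica’, 
is that $\det\left(\partial/\partial\zeta_{\alpha\beta}\right)\left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-1} = 0$. However, we have the more 
general result, a special case of Cayley's theorem “, 

$$\det\left(\partial/\partial\zeta_{\alpha\beta}\right)\left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-(1+\lambda)} = -\lambda(1 + \lambda)\left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-(2+\lambda)} \tag{38.23}$$

We set $\lambda = (p - 1)/2$. To get $\left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-N/2}$ we differentiate 
$\left[\det\left(\omega - \omega^{(0)} - i\zeta\right)\right]^{-(p+1)/2}$ by $\det\left(\partial/\partial\zeta_{\alpha\beta}\right)$ a total of $(N - p - 1)/2$ times. 

We then pull the derivative $\left[\det\left(\partial/\partial\zeta_{\alpha\beta}\right)\right]^{(N-p-1)/2}$ outside the $\{\omega_{\alpha\beta}\}$ integrals. 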

Using the Multiple Cauchy Theorem 
We are left with the task of integrating $[\det(\omega - \omega^{(0)} - i\zeta)]^{-(p+1)/2}$ over the
$\{\omega_{\alpha\beta}\}$ variables. The determinant $\det(\omega - \omega^{(0)} - i\zeta)$ consists of a sum of terms,
each term containing $p$ factors. We need to look at the zeros of the determinant in order to
apply the generalization of Cauchy's theorem. We can get the determinant to
vanish if we set to zero each term in the sum comprising the determinant.
Consider the change of variables:


$$(\omega - \omega^{(0)} - i\zeta)_{\alpha\beta} = (\varepsilon_\alpha \varepsilon_\beta)^{1/2}\,\gamma_{\alpha\beta}\,\exp[i(\varphi_\alpha + \varphi_\beta)/2] \qquad (38.24)$$


Each term in $\det(\omega - \omega^{(0)} - i\zeta)$ will contain $\prod_{\alpha=1}^{p} \varepsilon_\alpha \exp(i\varphi_\alpha)$ as a
common factor. This factor can then be pulled out of the determinant. Taking $\varepsilon_\alpha$
as a small number defines a small circle in each $\omega_{\alpha\beta}$ complex plane about the
point $(\omega^{(0)} + i\zeta)_{\alpha\beta}$. All factors $\varepsilon_\alpha$ cancel. Cauchy's theorem can then be
used for each $\omega_{\alpha\beta}$ separately. We recall that $\omega^{(0)} = i\rho^{-1}/2$. For nonzero results


? Acknowledgement: I thank Tom Gladd for verifying this special case of Cayley's 
theorem using Mathematica. 
we close in the $\omega_{\alpha\beta}$ upper half plane if $\rho^{-1}_{\alpha\beta} > 0$, which requires $V_{\alpha\beta} > 0$ for
the vanishing of the contour at infinity. In the other case $\rho^{-1}_{\alpha\beta} < 0$ that can
result if $\alpha < \beta$, we need $V_{\alpha\beta} < 0$ in order to close in the $\omega_{\alpha\beta}$ lower half plane.
The $\gamma_{\alpha\beta}$ matrix prevents the vanishing of the sum of the Levi-Civita alternating
signs $\epsilon_{\alpha_1 \alpha_2 \dots \alpha_p} = \pm 1$ in the determinant. Any $\gamma_{\alpha\beta}$ with $\det\gamma \neq 0$ can be used.

Applying the multiple Cauchy theorem on the $p(p+1)/2$ variables $\{\omega_{\alpha\beta}\}$
means that we make the replacement $\omega \to \omega^{(0)} + i\zeta$ in the factor
$\exp[i\,\mathrm{Tr}(\omega V)] \to \exp[-\mathrm{Tr}(V\rho^{-1})/2]\,\exp[-\mathrm{Tr}(\zeta V)]$. Using the following
identity $(N-p-1)/2$ times then completes the argument:


$$\det(\partial/\partial\zeta_{\alpha\beta})\,\exp[-\mathrm{Tr}(\zeta V)] = (-1)^{p}\,\det V \cdot \exp[-\mathrm{Tr}(\zeta V)] \qquad (38.25)$$


This concludes the discussion of the origin of the dynamical terms in the 
Wishart distribution. 
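
As a check on the last two steps, the substitution $\omega \to \omega^{(0)} + i\zeta$ and the identity (38.25) can be verified by hand. The sketch below assumes that the elements $\zeta_{\alpha\beta}$ are treated as independent variables when differentiating; that convention is an assumption made here for illustration and is not spelled out above. Only the $p = 2$ case is written out.

```latex
% Substitution \omega \to \omega^{(0)} + i\zeta, with \omega^{(0)} = i\rho^{-1}/2:
i\,\mathrm{Tr}\!\left[(\omega^{(0)} + i\zeta)\,V\right]
  = i\,\mathrm{Tr}\!\left[\tfrac{i}{2}\rho^{-1}V\right] + i\,\mathrm{Tr}\!\left[i\zeta V\right]
  = -\tfrac{1}{2}\,\mathrm{Tr}(\rho^{-1}V) - \mathrm{Tr}(\zeta V)

% Identity (38.25) for p = 2, assuming independent \zeta_{\alpha\beta}.
% Since \mathrm{Tr}(\zeta V) = \sum_{\alpha\beta}\zeta_{\alpha\beta}V_{\beta\alpha}:
\frac{\partial}{\partial\zeta_{\alpha\beta}}\, e^{-\mathrm{Tr}(\zeta V)}
  = -V_{\beta\alpha}\, e^{-\mathrm{Tr}(\zeta V)}

\det\!\left(\frac{\partial}{\partial\zeta}\right) e^{-\mathrm{Tr}(\zeta V)}
  = \left(\partial_{11}\partial_{22} - \partial_{12}\partial_{21}\right) e^{-\mathrm{Tr}(\zeta V)}
  = \left(V_{11}V_{22} - V_{21}V_{12}\right) e^{-\mathrm{Tr}(\zeta V)}
  = (-1)^{2}\,\det V \; e^{-\mathrm{Tr}(\zeta V)}
```

Each application of the derivative determinant therefore brings down one power of $\det V$, which is how repeated application builds up the power of $\det V$ in the distribution.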


Limitations of Above Evaluation of the Wishart Fourier Transform 


Although we have obtained the Fourier generating function and the dynamical
terms in the Wishart distribution, our proof using the Fourier-transform method is
not complete. The first problem is deriving the constant $K$. It seems that $K$
can only be obtained here by requiring that the total probability, the integral over
$\prod_{\alpha \le \beta} dV_{\alpha\beta}$, is one. The second problem involves the multiple Cauchy
theorem. First, we placed conditions on the signs of $\{V_{\alpha\beta}\}$ for $\alpha < \beta$ depending
on the signs of $\rho^{-1}_{\alpha\beta}$. Second, the determinant $\det(\omega - \omega^{(0)} - i\zeta)$ can vanish
through cancellations between its various terms. Treating the resulting
interdependent singularities in the $\{\omega_{\alpha\beta}\}$ integrals is extremely difficult.


1 Remarks: We cannot set $\lambda = (p-1)/2$ in the coefficients because some derivatives are
then zero. Instead, we keep $\lambda$ free in all constant factors. We also have a dependence on
the $\gamma$ matrix above.
I would appreciate learning if a derivation exists in the mathematical
literature resolving these difficulties in evaluating the Fourier transforms using
the multiple Cauchy theorem. Since the 1st edition, I have had no response.
39. Economic Capital (Tech. Index 4/10)


In this chapter, we discuss Economic Capital, mostly in a qualitative fashion¹.
We describe standard procedures and assumptions as well as problems and 
issues. Many relevant quantitative issues that play important roles in the actual 
calculations are discussed in detail elsewhere in the book. 


Basic Idea of Economic Capital 


Economic Capital² (abbreviated EC in this chapter) is now becoming a common
barometer for corporate risk. EC can be defined as the amount of liquid capital
needed to enable a firm ABC to survive (i.e. not default) under extreme and
unexpected adverse conditions. These conditions are taken to last for a given
period of time $\tau$. This definition, turned around, implies that ABC should be
able to lose an unexpected amount of money EC in time $\tau$ without
defaulting. Often $\tau = 1$ yr is chosen³. Other definitions for EC are sometimes
used, e.g. the estimated amount of capital needed to obtain a given credit rating
by a rating agency⁴. Another is the hypothetical premium needed for an outside
party to insure against default. These definitions are related but not equivalent.

Although no pot of money is literally set aside as economic capital, we will
regard EC as real concrete assets needed to cover losses. Following Moody's, we
assume that these assets are "permanent and immediately available to absorb
losses before general creditors are affected in any way".


1 History: Most of the work for this chapter was done in the period 1999-2001.
2 Units: Economic Capital is measured in USD, or whatever the reporting currency is.


3 Why one year? Sometimes it is said that one year is a “reasonable” period to require
for solvency before the company can access the capital markets to get more capital.


4 Rating Criteria: The rating criteria in reality are much more complex than just capital.
Stable revenue and sufficient cash flows are essential. See Moody's Rating Methodology
Handbook (Ref). For this reason alone, Economic Capital can only be a rough measure
for ratings. In this chapter, it is sometimes assumed for illustration that ABC is an Aa
(AA) rated bank or broker-dealer.
There are return/risk ratios that utilize EC for the risk in the denominator,
giving return on economic capital as a measure of business success. Thorny
problems of whether or not to include diversification effects for a given business
unit exist, and will be discussed below. Another thorny problem is whether to use
the Sharpe return/risk ratio that penalizes positive and negative returns equally, or
a Sortino ratio that only penalizes the downside.


Regulatory Definitions of Capital, and VAR 
Regulatory definitions of capital are in flux⁵. They involve VAR or a multiple of
time-average VAR.


Adverse Conditions for EC, and an Insurance Analogy


A critical variable is the choice of the meaning of “adverse” conditions or events
for EC. A priori there is no “right” or “wrong” definition. We will consider
some examples below. It is convenient to think of EC as acting as the
capital backing a fictitious “insurance policy” for ABC, with the policy written
by ABC to cover itself. Adverse events are supposed to be covered by
this “insurance”. Naturally, the extent to which these adverse events are far out
on the statistical tail is a critical consideration. If such an event (call it an
“asteroid”⁶ or a bad “earthquake”) does occur, losses depending on the exposure
of ABC will occur.

Either the EC is enough to cover these losses or not (in which case ABC
may indeed default). The EC also has to be in a form that will enable ABC to
cover the losses and still stay in business. If illiquid assets need to be sold,
probably into a hostile market, substantial additional losses may occur. On the
other hand, keeping the EC permanently in liquid assets, in case it might be
needed, means that a low return will be suffered and business opportunities may
be missed.


Example of the Insurance Analogy 


A mundane illustration may help. Assume that you do not have an earthquake 
insurance policy with an outside insurer. It is neither right nor wrong to have 
enough assets (your “economic capital”) to insure yourself to cover potential
severe earthquake damage to your house (your “exposure”). The probability for


5 Regulatory Capital: See the voluminous and constantly-changing Basel regulations in 
the BIS docs (refs). The use of some form of VAR for capital has remained. Our purpose 
here is to give insight and motivation. 


6 Asteroids: The word “asteroid” is used in this book for dramatic effect just to indicate
the sudden onset of a severe problem from stressed markets.
an earthquake may be far out on the probability distribution, and may never have
occurred in your area. 

If, nevertheless, an earthquake does occur, you need enough capital to rebuild 
the house to a livable state. Your capital may be liquid (e.g. cash in a money 
market account that returns little) or your capital may be tied up in illiquid assets 
(some of which you may well have to sell at a big discount). At the end, if you 
don’t have enough money to rebuild the house to a livable state, then the rule 
says that you “default”. 


Stress Testing and EC 


Stress tests can include what-if scenarios, historical scenarios, and statistical 
measures. We have discussed these earlier in the book, and provide some more 
insight below. A variety of these tests is performed at various institutions. Each
of them can be used for a part of Economic Capital. 


What-if Scenario (WIS) as Indicators 


What-if scenarios (WIS) can be envisioned for quantification of EC in some 
cases. WIS have been and will remain a staple of risk management. At the same 
time, WIS are clearly just indicators. WIS can be formulated in a number of 
ways. These include: 


• Numerical (specified changes in stock indices, bonds, FX, commodities, etc.)
• Economic or political (deep recession, default of large banks, wars, etc.)


It is clear that there is no point in considering something like a “Rand
Corporation nuclear-war” scenario, i.e. a scenario in which the adverse climate is
so severe that ABC will not exist at all regardless of any EC consideration.

Therefore, the scenarios we want to consider are essentially serious
“mid-level” disasters.


Numerical What-if Scenarios 

Numerical WIS are very common. Based on history or an estimated projection, or
some other idea, changes are postulated for the variables $\{x_\alpha\}$ in each
scenario. For example, a scenario might assume that (over a period
$\tau = 1$ yr) the S&P 500 drops 40%, oil prices rise 30%, the USD drops 10%
with respect to major currencies, gold rises 25%, etc. Note that correlations are
built into such a what-if scenario (i.e. stocks down, oil up means $\rho_{\text{stocks, oil}} < 0$
over that time period).
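
To make the mechanics concrete, here is a minimal sketch of how a numerical what-if scenario could be turned into a loss figure. The shocks are the ones quoted above; the exposures of ABC and the function name `scenario_pnl` are hypothetical, and the calculation is only a first-order (linear) approximation.

```python
# Scenario shocks in percent, taken from the example in the text.
scenario = {"SP500": -40.0, "oil": 30.0, "USD": -10.0, "gold": 25.0}

# HYPOTHETICAL linear exposures of firm ABC: P&L in USD millions per +1% move.
exposures = {"SP500": 2.0, "oil": -0.5, "USD": 1.0, "gold": 0.2}


def scenario_pnl(shocks, sensitivities):
    """First-order P&L of a what-if scenario: sum of sensitivity * shock."""
    return sum(sensitivities[name] * shock for name, shock in shocks.items())


if __name__ == "__main__":
    pnl = scenario_pnl(scenario, exposures)
    print(f"Scenario P&L: {pnl:.1f} USD mm")
```

With these made-up exposures the scenario produces a loss of 100 USD mm. For moves as large as 40%, options and other nonlinear positions would need full revaluation rather than a linear sensitivity.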
Underlying Economic or Political What-if Scenarios


Economic or political WIS involve questions like: “How much capital do we
need for ABC to survive under a deep recession?” In order to quantify this, we
clearly need a definition of a “deep recession” and we need to attempt to
calculate the result of the existence of a deep recession. This still means
specifying changes in the variables $\{x_\alpha\}$ for the recession scenario and the time
interval over which these changes occur⁷.

It should be noted that although a deep recession may not have occurred in 
the last 30 years since the invention of modern finance, this does not mean that a 
deep recession cannot occur. The same holds true for other possible disasters. 


Probabilities, Entropy, and What-if Scenarios 


It is useless to try to calculate the probability of some given definition of a WIS 
such as a given numerical scenario or the results of a deep recession. This is not 
so important. It is a red herring to argue that if you cannot calculate the 
probability of a scenario, you should not think about it. A given WIS specifying 
many details will have a very small probability. However, the possible number of 
ways that some disaster can occur (the “entropy”) is very large. Therefore it is
perfectly reasonable to consider a representative disaster using a WIS.
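
The entropy argument can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that there are many distinct disaster scenarios, each with the same tiny probability, and that they are independent; the numbers and the function name are hypothetical.

```python
def prob_at_least_one(p_single, n_scenarios):
    """Probability that at least one of n independent scenarios occurs."""
    return 1.0 - (1.0 - p_single) ** n_scenarios


# One fully specified scenario: essentially negligible probability.
single = prob_at_least_one(1e-6, 1)

# 100,000 different ways for "some disaster" to happen: no longer negligible.
some_disaster = prob_at_least_one(1e-6, 100_000)

if __name__ == "__main__":
    print(f"one detailed scenario: {single:.2e}")
    print(f"any of 100,000 scenarios: {some_disaster:.3f}")
```

Under these assumptions each individual scenario has a probability of about one in a million, while the chance that some disaster of this general type occurs is roughly 10%.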


Historical Scenarios (HS) as Indicators 


Historical scenarios involve questions like: “How much capital do we need for
ABC to survive under a series of events like the stock market crash of 1987?” To
answer, we calculate what the consequences would be if a historical 1987 scenario
were to repeat. Of course, history never exactly repeats, so HS are also just
indicators.


Statistical Measures (SM) as Indicators 

Statistical measures involve questions like: “How much capital do we need for
ABC to survive at a probability level of 99.97%?” This latter number corresponds to the
average one-year historical default probability (0.03%) for Aa-rated companies from
Moody's over 1970-1998⁸. SM have to be defined with respect to some statistical
calculations, and there are a variety of uncertainties regarding such calculations.


7 Economists and Financial Variable Changes: Economists may naturally be cautious 
about specifying too much detail about the changes in the variables needed for a risk 
calculation. This means some ad-hoc assumptions will still be needed. 


8 What's so Special About the 99.97% Confidence Level? Other time intervals in the
data naturally produce different numbers, e.g. 99.93% over 1920-1996. See Moody (ref).
We have spent a lot of time in this book discussing various sorts of SM, e.g.
Stressed VAR with fat tail jumps and stressed correlations. In the end, SM are 
also indicators. 


The Classification of Risk Components of Economic Capital 


Economic Capital has many components. After all, EC is supposed to represent
capital needed to survive against all risks. Traditionally three risk categories
(market, credit, operational) are used for classification in EC. Although not now
included in EC, there are two other important categories of risk (systemic,
climate change). The risk category list is thus:


• Market Risk
• Credit Risk
• Operational Risk
• Systemic Risk
• Climate Change Risk


We have spent considerable time on market risk. The best candidate for a 
real-world assessment of the market risk component of EC, in our judgment, is 
the Enhanced/Stressed VAR (cf. Ch. 27). We discussed credit risk in Ch. 31. 
Examples of operational risk are model risk (Ch. 32), data risk (Ch. 36), and 
systems risk (Ch. 34), plus scattered comments. 

The fourth category is Systemic Risk resulting from collective interactions
of financial institutions, thus accentuating collective instability. By definition a 
silo-based risk analysis by an individual institution is blind to systemic risk. 
Fundamental industry-wide collective financial instability is ignored by only 
considering market-credit-operational risk on an entity basis. 

The fifth category is climate change risk as discussed in Chapter 53, along 
with other external / environmental risks. 

The enumeration of risks is very large and, by definition of Murphy's Law⁹,
incomplete. Also, the placement of an individual risk is sometimes unclear¹⁰.


9 What's the Next Risk Type Surprise? While market risk has large gaps from time to
time, and while credit failures can be spectacular, the really dangerous possibilities lie in
operational risk, systemic risk, and climate change risk.


10 Would Linnaeus Agree that This Classification is Complete Enough? With the
triumvirate market-credit-operational classification along with the added systemic and 
climate change categories, any risk has to be shoehorned in somewhere. One issue 
regards the multitude of possible risks. A conference speaker once wrote a long catalog of 
risks on one slide for emphasis. The font was very small and the slide appeared black. 

A pesky issue regards combination risks. An example is model risk for convertible 
bonds. The model risk shows up as part of the bid-ask spread (market risk), depends on 
Consistent vs. Inconsistent Calculations of EC Components


The practical calculations of the various components of risk comprising EC are
done in a variety of ways. This is because it is not possible to calculate all risks in 
a consistent framework. No single methodology is rich enough to cover the 
estimation of disasters and it is sometimes not possible to ensure that risk 
assumptions are internally consistent from one area to another. Judgment and 
policy therefore play a role in practice. 


Exposures for Economic Capital: What Should They Be? 


EC is generally calculated using exposures as of a given date. However, a
forward-looking measure is desirable. In the next chapter, we will discuss a
framework for an estimate of EC for unused limits, that is, for the possibility that
businesses change their exposures within their limits during the period $\tau$ in the future.


Attacks on Economic Capital at High CL 


Various critiques have been leveled at Economic Capital, sometimes by smart
traders pushing back, and sometimes by quants. The main issue is that Economic
Capital is expensive, focuses on rare events that are hard to measure, and might
be used in assessments of risk that influence compensation.

We present three arguments, which we call “attacks” on EC, because that is
what they are. These arguments have varying degrees of relevance.


First Attack Misses: Lascaux Cave Paintings and an Ergodic Statement 


The use of high confidence levels is often attacked using what amounts to an
ergodic statement. For example, suppose that we have a 1-year time frame for
EC and that we take the Moody's 1970-98 Aa default CL = 99.97%. This
translates into a default probability of 3/10,000. The argument says that this is an
absurd measure because (and this is the ergodic statement) we cannot look at the
worst 3 out of the last 10,000 years, taking us back halfway to the time of the
Paleolithic Lascaux cave paintings.

This argument is a red herring. Naturally, we do not want to argue that
anything that happened in prehistoric times has much to do with (e.g.) swaps


corporate credit spreads (credit risk), and several different possible models could be 
chosen (operational risk). One procedure, which we adopt here, is just to put model risk 
into the catch-all operational risk. 
traders. In fact, we do not want to use such an ergodic statement, and we are not
forced into using it by false consistency.

The proper argument is that we have some information (albeit imperfect)
about default statistics from what happened in the 20th century. We use this
information for probabilities of default in 10,000 states starting now. These
10,000 states can be generated by Monte Carlo simulation.
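
A minimal sketch of this idea follows. It generates 10,000 one-year states from a purely hypothetical normal loss distribution and reads off the loss level that only 3 states exceed. Treating EC as that tail level minus the mean loss is an assumption made here for illustration, as are the function names and all numbers.

```python
import random


def loss_quantile(losses, cl):
    """Loss level exceeded in only a fraction (1 - cl) of the simulated states."""
    ordered = sorted(losses)
    n_tail = round(len(ordered) * (1.0 - cl))  # states allowed beyond the level
    return ordered[max(len(ordered) - n_tail - 1, 0)]


def economic_capital(losses, cl):
    """Unexpected loss: tail quantile minus the mean (expected) loss."""
    expected = sum(losses) / len(losses)
    return loss_quantile(losses, cl) - expected


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # HYPOTHETICAL one-year loss distribution (USD mm): normal, mean 10, sd 50.
    states = [rng.gauss(10.0, 50.0) for _ in range(10_000)]
    print(f"EC at 99.97% CL: {economic_capital(states, 0.9997):.1f} USD mm")
```

Note that with 10,000 states the 99.97% level is determined by only the three worst outcomes, so the estimate is noisy; in practice many more states, or a tail model, would be needed for a stable number.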


Second Attack is Closer: Not Enough Companies 


The second argument is that using this high CL is absurd because we do not
have enough Aa companies in the historical data for robust probability estimates.
This is a better argument. Moody's (exhibit 34) shows that only one company
defaulted from 1970-1998 that had an Aa rating one year prior, DFC Financial
(Overseas) Ltd on 10/3/89. Different periods of time do produce different results
for default probabilities.


Third Attack Hits the Target: Is EC Related to Default? 


The third argument is that even if we use such a high CL for default, there is no 
apparent reason to use the same CL for movements of the underlying variables. 
In other words, the default of an AA company in the real world may not be 
correlated, e.g., with equivalently large moves of market variables. This is an 
excellent argument. 

There has not been any real attempt to achieve consistency in the philosophy 
of EC between potential causes of default and the fact that the calculation of 
EC is based on default probabilities. 

Indeed, defaults often seem to be caused by liquidity cash-flow problems in
practice, not capital. That is, the firm misses an interest payment on debt. For
example in 1998, 123 public corporations defaulted, with 66 defaults due to
missed interest payments. A firm can have cash-flow problems causing default
while still having plenty of capital. However, lenders would be reluctant to provide
funding if corporate debt is high, which can lead to cash-flow default¹¹.

However, it can logically be assumed that a minimum amount of capital on 
the order of EC is needed to avoid default over an extended period if a loss on 
the order of EC occurs. If EC is smaller than losses, problems can arise. So, 
while this third attack is disturbing, it does not kill the high CL approach to EC. 


11 Acknowledgements: I thank Tom Schwartz and Adam Litke for illuminating
conversations on this and many other topics.
What if Economic Capital is Not “Big Enough”? [The 2008 crisis]


What do we actually do for EC? A measure is adopted, difficult data collection 
is done, calculations are performed, and presentations are given. What happens if 
“not enough” economic capital is allocated? Maybe nothing (if markets are 
stable), maybe disaster (if markets are collapsing). 

“Disaster” can mean increased margin calls and rates, frantic attempts to 
deleverage, panic stock selling as stock prices drop, forced fire sales as asset 
prices collapse with models failing, liquidity vanishing, failure to get funding, 
credit downgrades, and default. Both the sell-side and the buy-side are 
vulnerable, as events in the 2008 crisis showed. Systemic risk was rampant.

I argued vigorously years before the 2008 crisis that Stressed Value at Risk
should be used to raise economic capital above the low levels of standard VAR.
It didn't work¹².


Allocation: Standalone, Component VAR, or Other? 


Suppose we are given a calculation of Economic Capital EC for the firm, and
we accept the results. We still have to allocate the firm's EC between desks or
business units (BU). Allocation is a difficult topic. Several possibilities exist
that we discuss in turn¹³. See also the Ch. 30 discussion for corporate-level VAR.


Standalone Risk for Allocation? 


We might want to look at each BU as a separate entity. In that case, we would
use the stand-alone result $EC_a^{(SA)}$ for $BU_a$, containing risk from $BU_a$
positions and intra-BU diversification inside $BU_a$ only, but not any inter-BU
diversification $(BU_a, BU_b)$ for other business units $BU_b$.


However, assume that we write the total EC as the sum of the stand-alones,


$$EC^{(\text{Sum of SA})} = \sum_a EC_a^{(SA)} \qquad (39.1)$$


12 History: Reaction to my pre-2008 large SVAR Economic Capital: The skies were
blue, profits were astounding, and risk capital was an annoyance. On presenting my ideas
to the top risk committee, the head derivatives trader sitting across from me tried to argue
by saying “Jan, there hasn't been a recession since the invention of modern finance”. No
action was taken. I am proud that I tried anyway.


13 Acknowledgement: I thank Jim Marker and Jack Fuller for helpful conversations on
this and other topics.
Now we have a problem. The sum of stand-alones $EC^{(\text{Sum of SA})}$ does not
correctly assess the firm's risk because it allows no diversification offsets, and
therefore $EC^{(\text{Sum of SA})} > EC$. Indeed, the major impetus for much of modern
corporate strategy is precisely to take advantage of inter-BU diversification.
There is also a serious consistency problem. Suppose there is an
administrative reallocation of risks leaving the total risk unchanged. For example,
take a hedged position with zero risk. Put the long position in $BU_a$ and the short
hedge in $BU_b$. The standalone risks change. Hence, without changing the risk of
the firm, the total standalone $EC^{(\text{Sum of SA})}$ also changes.


The bottom line is that the Sum of Standalones approach has the virtue of dealing
with each business unit separately. However, it has the vice of being neither a
consistent nor a realistic measure of economic capital for the firm.
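
The consistency problem can be seen in a toy calculation. The sketch below assumes a single risk factor, so that stand-alone EC is simply a multiplier times volatility times the absolute net exposure of a business unit; the function names, the volatility, and the multiplier are all hypothetical.

```python
def standalone_ec(net_exposure, vol, multiplier):
    """Stand-alone EC of one unit: multiplier * volatility * |net exposure|."""
    return multiplier * vol * abs(net_exposure)


def sum_of_standalones(bu_positions, vol, multiplier):
    """Add up stand-alone ECs, netting positions only inside each business unit."""
    return sum(standalone_ec(sum(p), vol, multiplier) for p in bu_positions.values())


def firm_ec(bu_positions, vol, multiplier):
    """Firm EC with all positions netted across business units."""
    total = sum(sum(p) for p in bu_positions.values())
    return standalone_ec(total, vol, multiplier)


VOL, K = 0.20, 3.43  # HYPOTHETICAL annual volatility and tail multiplier

# The same hedged position, booked in two different ways.
hedge_in_one_bu = {"BU_a": [100.0, -100.0], "BU_b": []}
hedge_split = {"BU_a": [100.0], "BU_b": [-100.0]}

if __name__ == "__main__":
    for label, book in [("one BU", hedge_in_one_bu), ("split", hedge_split)]:
        print(label, sum_of_standalones(book, VOL, K), firm_ec(book, VOL, K))
```

In both bookings the firm's netted risk is zero, but the sum of stand-alones is zero in the first case and 137.2 in the second, purely because of where the two legs are booked.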


Component VARs for Allocation? 


Given the firm's EC including inter-BU diversification, the Component VARs (CVARs)
provide a consistent methodology to allocate total risk¹⁴. Indeed, if we set
$EC_a = CVAR_a$ then we are guaranteed by construction that $EC = \sum_a EC_a$
consistently.

The complication here is that, also by construction, $EC_a$ for $BU_a$ is naturally
dependent on correlations with the risks included in $EC_b$ for $BU_b$. This means
that a given $BU_a$ can do nothing different (or even do absolutely nothing at all)
and wind up with its $EC_a$ being changed due to activities of a different $BU_b$.

An unusual but possible case is that $CVAR_a < 0$ is negative, implying that
$BU_a$ is hedging out other risk in the firm. Hence $EC_a$ allocated for $BU_a$ using
the CVAR approach will also be negative.

Businesses and upper management want to view each individual business- 
unit risk as due to its own individual activities. While the CVAR approach is 
internally consistent, the sociological problems using CVARs can be non- 
negligible. 
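
Both effects, additivity and the dependence of one unit's allocation on another unit's activity, can be seen in a small sketch. It uses a standard marginal (Euler-style) allocation of a volatility-based risk measure; whether this matches the book's CVAR definition in every detail is an assumption, and the covariance matrix and exposures are made up.

```python
import math


def component_ec(exposures, cov, multiplier=1.0):
    """Split total EC = multiplier * sqrt(w' C w) into additive per-unit components."""
    n = len(exposures)
    cov_w = [sum(cov[a][b] * exposures[b] for b in range(n)) for a in range(n)]
    variance = sum(exposures[a] * cov_w[a] for a in range(n))
    if variance <= 0.0:
        return [0.0] * n  # fully hedged book: nothing to allocate
    sigma = math.sqrt(variance)
    return [multiplier * exposures[a] * cov_w[a] / sigma for a in range(n)]


# HYPOTHETICAL two-unit firm: BU_a is long 100, BU_b is short against it.
COV = [[1.0, 0.8], [0.8, 1.0]]  # made-up covariance of the two risks
small_hedge = component_ec([100.0, -40.0], COV)
large_hedge = component_ec([100.0, -80.0], COV)  # only BU_b changed its position

if __name__ == "__main__":
    print("small hedge:", small_hedge, "total:", sum(small_hedge))
    print("large hedge:", large_hedge, "total:", sum(large_hedge))
```

In this made-up example the components always add up to the total, the hedging unit's component is negative in the first case, and the first unit's component moves from about 94.3 to 60 although its own position is unchanged.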


Compromise Recipe for Allocation? 

A possible compromise procedure is to list the total firm's EC, correctly
calculated as $EC$ with diversification reductions, when dealing
with firm-wide reporting. Allocation to each $BU_a$ is performed using its own
stand-alone economic capital $EC_a^{(SA)}$.

The disadvantage of this compromise is again that the origins of corporate
strategy of diversification are not present in the allocations.


14 CVAR: The CVAR methodology is described in detail in Ch. 26-30.


The Cost of Economic Capital 


Assume that we regard EC as an amount of traditional capital to be kept in
liquid assets in order to avoid cash-flow problems¹⁵, in case of a loss equal to
EC. Then we can define a cost of economic capital as being related to a “Lost
Opportunity Spread” $s_{\text{Lost-Opportunity}}$, defined as


$$s_{\text{Lost-Opportunity}} = R_{\text{Illiquid}} - R_{\text{Liquid}} \qquad (39.2)$$


This spread is the difference between the return $R_{\text{Illiquid}}$ (that could have been
obtained by investing EC in illiquid assets) and the smaller return $R_{\text{Liquid}}$ (for
holding EC in liquid assets). $R_{\text{Illiquid}}$ is related to the marginal efficiency of
capital, which is the yield earned by the last additional unit of capital (here
associated with EC).


Presumably the spread $s_{\text{Lost-Opportunity}}$ would be related to the “cost of
insurance”, if such insurance were hypothetically available from a reliable third
party to cover losses equal to EC.
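
As a simple numerical illustration, the spread of Eq. (39.2) can be turned into an annual cost. Taking the cost to be exactly the spread times EC is an assumption made here (the text only says the cost is related to the spread), and the returns and EC amount are hypothetical.

```python
def lost_opportunity_spread(r_illiquid, r_liquid):
    """Eq. (39.2): return forgone by holding liquid rather than illiquid assets."""
    return r_illiquid - r_liquid


def annual_cost_of_ec(ec, r_illiquid, r_liquid):
    """ASSUMED cost of holding EC in liquid form: spread * EC (currency per year)."""
    return lost_opportunity_spread(r_illiquid, r_liquid) * ec


if __name__ == "__main__":
    # HYPOTHETICAL inputs: EC of 1,000 USD mm, 8% illiquid vs 3% liquid return.
    cost = annual_cost_of_ec(1000.0, 0.08, 0.03)
    print(f"Annual cost of EC: {cost:.1f} USD mm/yr")
```

With these inputs the spread is 5% and the carrying cost is 50 USD mm per year.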


An Economic-Capital Utility Function 


Consider¹⁶ a firm-wide utility function $\Psi$ related to return¹⁷ $R$ and economic
capital with risk coefficient $\lambda_{EC}$,


15 Avoiding Loss of Investor or Consumer Confidence: The presence of enough
traditional liquid capital presumably also serves to retain investor and consumer
confidence.


16 Acknowledgement: Santa Federico has many sophisticated ideas for the utility
function. I thank Santa for helpful discussions on this and many other topics.


17 Units: The units of the return, the economic capital, and the utility function are 
USD/year. The Sharpe ratio has no units. 




$$\Psi = R - \lambda_{EC} \cdot EC \qquad (39.3)$$


A firm might try to adopt a corporate strategy that maximizes $\Psi$ for a given
risk tolerance $\lambda_{EC}$ for losses on the order of EC. An example of $\lambda_{EC}$ could be
the magnitude of the lost opportunity spread, $\lambda_{EC} = |s_{\text{Lost-Opportunity}}|$.


Constraints on important issues such as leverage limitation, minimal
diversification, core business requirements, business costs, and sufficient
liquidity could be imposed to prevent runaway unphysical solutions¹⁸.


Firm-wide Sharpe Ratio and Economic Capital


The Sharpe return/risk ratio $S_{\text{Firm}}$ for the firm could be taken as the utility
function (the risk-adjusted return) divided by the risk (measured by a form of
Economic Capital). It is most convenient to use $EC_{\text{Firm}}^{(\text{Sum of SA})}$ in the
denominator¹⁹, viz


$$S_{\text{Firm}} = \Psi / EC_{\text{Firm}}^{(\text{Sum of SA})} = (R - \lambda_{EC} \cdot EC) / EC_{\text{Firm}}^{(\text{Sum of SA})} \qquad (39.4)$$


The business-unit $BU_a$ Sharpe ratio $S_a$ could be similarly defined. For the
$S_a$ numerator, an amount should be subtracted from the $BU_a$ return $R_a$ equal
to the charge for $EC_a$ based on the firm-wide risk coefficient $\lambda_{EC}$, or on the
lost-opportunity spread $s_{\text{Lost-Opportunity}}$. The standalone $EC_a^{(SA)}$ should be used for
the $S_a$ denominator to avoid potential problems with negative (or zero) $EC_a$
obtained with the CVARs. So $S_a$ is


$$S_a = (R_a - |s_{\text{Lost-Opportunity}}| \cdot EC_a) / EC_a^{(SA)} \qquad (39.5)$$


18 When Will We See Calculations Using the Firm's Utility Function? Probably at
about the same time as the appearance of a real-time movie of firm-wide risk in color. I
am skeptical.


19 Downside Risk Measure: Note that Economic Capital is a downside risk measure, 
meaning the downside is penalized but the upside is not. 
In the limit that $BU_a$ is the whole firm, $S_a \to S_{\text{Firm}}$ if $\lambda_{EC} = |s_{\text{Lost-Opportunity}}|$.

Revisiting Expected Losses; the Importance of Time Scales 


Standard Assumption: No Uncertainty for Expected Losses 


We have so far taken the conventional definition of EC as involving only
unexpected losses²⁰. Implicit in this definition are two statements:


• Expected losses $C_{\text{Expected Losses}}$ do not depend on the stressed environment.
• Sufficient reserves $C_{\text{Reserves}}$ over pricing margins cover expected losses.

The idea behind these assumptions is that expected losses are not risky because, 
after all, they are known and can be dealt with deterministically. 


More Realistic: Expected Losses Do Have Some Uncertainty 


There is a problem with the standard assumption. Think of the time dependence 
of loss as being composed of two parts, a drift and volatility. The expected loss 
acts as the drift. The problem is that, due to the change to a stressed environment, 
the expected loss can change, perhaps substantially, over a one-year period. 

For this reason, the first statement, that expected losses do not depend on the 
stressed environment, is dubious. For example, if we enter a recession 
environment, the average expected losses could increase due to reduced 
consumer demand. The second statement regarding reserves and pricing margins 
may or may not be true, depending on the details²².

It may be better not to make these assumptions and to write the explicit
expression for the difference $\delta C_{\text{Expected}}$ instead,


20 Acknowledgement: I thank Evan Picoult for helpful discussions on this and many
other topics.


21 Prophecy? This was written in the 1st edition, long before the recession of 2008. Of
course the resulting losses in 2008 were not at all “expected”.


22 Pricing Margins and Expected Losses: The inclusion of expected losses in the
pricing of goods and services may be problematic in a stressed environment where
increased competition may exert pressure to lower prices exactly at the same time that
expected losses are increasing.
$$\delta C_{\text{Expected}} = C_{\text{Reserves}} - C_{\text{Expected Losses}}^{(\text{Stressed Environment})} \qquad (39.6)$$


The amount $\delta C_{\text{Expected}}$ could then be included in a revised definition of EC.
If indeed it turns out that $\delta C_{\text{Expected}} = 0$, then the EC will be unchanged. If
reserves are greater than expected losses in a stressed environment, then the EC
will be decreased because these extra reserves could be used. If however reserves
are smaller than expected losses in a stressed environment or if pricing margins
dropped, the real EC needed will logically be expected to increase²³.

Therefore, the point is essentially that uncertainty in the expected losses
should be included in the uncertainty leading to EC.
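
The three cases just described can be sketched in a few lines. The simple subtraction used for the revised EC is an assumption made here for illustration (the text only says the difference could be included in a revised definition), and the function name and numbers are hypothetical.

```python
def revised_ec(ec, reserves, stressed_expected_losses):
    """ASSUMED revision: EC minus (reserves - stressed expected losses), Eq. (39.6)."""
    delta_c_expected = reserves - stressed_expected_losses
    return ec - delta_c_expected


if __name__ == "__main__":
    # HYPOTHETICAL figures in USD mm; stressed expected losses are 80 in all cases.
    print(revised_ec(1000.0, 80.0, 80.0))   # reserves match: EC unchanged
    print(revised_ec(1000.0, 100.0, 80.0))  # surplus reserves: EC decreases
    print(revised_ec(1000.0, 50.0, 80.0))   # reserve shortfall: EC increases
```

The three calls give 1000, 980, and 1030 respectively, matching the unchanged, decreased, and increased cases in the text.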


Time Scales Are Again the Issue 


The problem lies in the time scales. The EC calculations are usually envisioned 
as being due to short-term “asteroid-like” adverse conditions. Discussions on 
EC in this sense revolve around the length of time for hedging, the amount of 
risk hedged, etc. These topics were discussed at length in the chapter on 
Enhanced/Stressed VAR in Ch. 27. 

On the other hand, the uncertainties in the expected losses are due to
longer-term “getting stuck in the mud” adverse conditions. These can be quite different
but no less severe.

In Ch. 47-51, we discuss the Macro-Micro model that incorporates 
uncertainties in macro components of variations of underlying variables over 
long time scales. The difficulties discussed for economic capital here arise from 
exactly the same point. 


Summary for Time Scales and EC 
The high-level summary is that there is not enough attention paid to the time 
scales of risk. The dynamics are completely different for short and long time 
scales. 

It would be more realistic if Economic Capital assumptions and calculations
took these time scales into account explicitly.


Cost Cutting and Economic Capital 


If EC is regarded as a measure of default in a literal sense, another refinement
enters. Namely, some fraction $f$ of returns $A$ could be made available to cover
mandatory cash-flow payments and help solve liquidity problems. This involves
a transfer from spending for variable costs, essentially through cost cutting²⁴, and
is exactly the procedure followed by corporations with liquidity difficulties.
Since this amount $f \cdot A$ replaces forced sales of some assets to cover mandatory
cash flows, it can be viewed as replacing part of the capital needed to avoid
default, and therefore could be included to reduce the EC.

Normally, no influence of returns is present in the calculations for EC.
Again, this presents a consistency problem if high confidence-level (CL)
calculations are used (e.g. 99.97% for Aa credit) that are motivated by default
statistics, while on the other hand the dynamics of real-world default involving
cash-flow liquidity problems are ignored for EC calculations.

It would be more realistic to change the procedure for calculations of EC to
make the EC more relevant to real-world considerations.


23 Consumer Business: These considerations could be important for risk for consumer
businesses (e.g. credit cards) where major shocks are unlikely and the main risk is slow
but important degradation due to changing economic conditions.


Traditional Measures of Capital 


We have been discussing Economic Capital EC. To review, EC is the calculated
capital needed to offset unexpected loss for adverse events according to some
conservative criteria for market, credit, and operational risks. We should also add
systemic risk capital. As discussed in Ch. 53, we should eventually add climate
change risk capital.

On the other hand, traditional capital measures exist, as explained in texts on
corporate finance. For example, the return on common equity uses common
equity capital. Common equity capital is defined in corporate finance as common
stock at par + capital surplus + retained earnings. A closely related capital is book
value, which is share capital + additional paid-in capital + retained earnings.


Traditional Capital is Not Economic Capital 


It is clear that these traditional capital measures are not equal to EC. In a sense,
traditional capital is “capital you have”, while EC is “capital you need” to
survive stressed environments. The connection is supposed to be that enough
traditional capital is needed to prevent default if unexpected losses on the order
of EC occur.

If management desires a traditional measure of capital to be allocated or
to be used in Sharpe return/risk ratios, the calculation of EC would not seem to
be of much relevance. We have already seen the difficulties of allocating EC
itself. It is unclear how to perform allocation of other forms of capital that are not
involved in the economic capital calculations.

For example, numerical scaling of the allocations by the ratio of book
value to economic capital is easy to write down, but has uncertain meaning.


24 Cost Cutting and Your Job/Bonus: Cost cutting in finance includes hiring freezes,
mass layoffs (a periodic Wall Street tradition), bonuses slashed and increasingly paid in
restricted stock, etc. Career opportunities go “roller coaster” with the market. BTW, a
popular management formula for your bonus is the minimum amount such that you won't
quit. Still, in perspective, jobs in finance are quite rewarding (sure beats Bell Labs,
which, uh - oh yes, due to superior upper management skills, no longer exists).


40. Unused-Limit Risk (Tech. Index 5/10)


In this chapter, we deal with exposure-change risk as an extension to risk 
calculations and Economic Capital. Most risk assessments use existing portfolios 
and exposures. We do want to gauge the historical accuracy of our risk 
assessments through backtesting. Nonetheless, we are really interested in 
assessing future risk. After all, the future is risky, not the past. Therefore, we are 
(or should be) interested in the risk due to potential changes in risk exposures, 
consistent with limit constraints. In this book, we use a forward + option
approach in order to model this potential exposure-change unused-limit risk¹,²,³.


General Aspects of Risk Limits 


In order to discuss exposure-change risk, we need to discuss limits. Considerable 
effort needs to be expended in order to accomplish the various goals and 
activities involving the establishment and the monitoring of limits. 


Types of Limits 
Limits constraining risk exposures that can be assumed by desks are imposed in
different ways. For example, detailed limits may be set for a given exposure
$\$\mathcal{E}^{(a)}_{\alpha\beta}$ of a given product $\beta$ on a given desk $a$ that depends on the underlying
variable $x_\alpha$, or $\alpha$ for short. For example, we can have a limit on vega exposure
($\$\mathcal{E}$) for Libor ($\alpha$) Bermuda swaptions ($\beta$) on the exotic options desk ($a$).

A limit can be imposed on some measure $M^{(a)}(\beta)$ of a product $\beta$ depending
on several underlying variables, for example the composite notional of Latin
American bonds.


1 Acknowledgements: I thank Dave Bushnell for insightful comments that greatly
facilitated this work. I thank Andy Constan for a related discussion. I thank the Market
Risk Managers at Citigroup for helpful conversations on this and many other topics.


2 History: I developed this unused-limit risk model in 1999-2000.


3 Economic Capital: I believe unused limit risk should be part of Economic Capital.


On the other hand, a specific exposure may not have a specific limit. An
example might be the 10-year AA credit spread risk of industrials, although these
bonds would be included in more general limits.


An overall limit may exist on the total exposure $\$\mathcal{E}^{(a)}_{\alpha}$ of the variable $x_\alpha$ on
desk $a$, summed over all product types $\beta$. For example, we can consider delta
for the S&P index on the equity options desk.

Limits can also be imposed on exposures summed across desks in a division,

$$\$\mathcal{E}^{(\text{Division})}_{\alpha} = \sum_{a \in \text{Division}} \$\mathcal{E}^{(a)}_{\alpha}$$

An example could be the total spread DV01 across all fixed-income desks.


Setting and Monitoring Limits 


Setting limits depends on choosing the most important and relevant risk 
exposures, performing risk scenarios or calculations at some level, specifying the 
amount of loss to be tolerated, specifying business requirements, and other 
aspects. Intensive discussions and negotiations between Risk Management and 
the business units may take place to define and to set parameters for specific 
limits. 

Systems need to be constructed to monitor the limits efficiently. Otherwise, 
the monitoring has to be done by hand, which is time consuming. 

The number of exposures used in VAR or other risk calculations can be very 
large. Setting limits on all such exposures would be tedious to monitor and 
counter-productive to impose. For this reason, the number of limits can be much
smaller than the number of possible exposures. Generally, some aggregation is used
in setting limits, e.g. spread DV01 for investment-grade corporates. Still, the
collection of limit specifications can produce a large document.

In practice, as opportunities arise and as portfolios change, exceptions to 
limits may (or may not) be granted. Periodic review and possible resetting of 
limits can occur. 


Example of a Conundrum with Detailed Limits 


We need to be careful in order that the limits do measure real risk. Here is a 
simple example of detailed limits that backfire. Suppose we have limits on two 
buckets #1 and #2 in maturity. For example, bucket #1 could be 0-2 years and 
bucket #2 could be 2-5 years. 

Imagine that initially we have a hedged position with a “calendar spread” in 
bucket #2. For example, we can have a long position at a slightly shorter maturity 
than a short position, both in bucket #2. Assume that either position individually 
would violate the limit, but that together the risks cancel out, giving zero risk in 
bucket #2. Assume nothing is in bucket #1.

As time progresses, the long position can move into the shorter maturity
bucket #1, but with the short position still staying in the longer maturity bucket
#2. At this point, the limits in both buckets are violated. However, the total risk
has not changed (modulo possible risks explicitly associated with moving to
shorter maturities). Therefore, in this case red lights and alarms go off in the
system monitoring the limits, even though the real risk may be small.
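
The conundrum can be made concrete with a small sketch. The bucket boundaries
follow the text; the position sizes, the per-bucket limit of 100, and all names
(`Position`, `bucket_exposures`, `breaches`, `age`) are hypothetical choices made
only for illustration, not part of any actual limit-monitoring system.

```python
from dataclasses import dataclass


@dataclass
class Position:
    maturity_years: float
    dv01: float  # signed sensitivity: + for long, - for short (hypothetical units)


# Maturity buckets from the text: #1 is 0-2 years, #2 is 2-5 years.
BUCKETS = {"#1": (0.0, 2.0), "#2": (2.0, 5.0)}
LIMIT = 100.0  # assumed absolute limit per bucket


def bucket_exposures(positions):
    """Net signed exposure in each maturity bucket."""
    totals = {name: 0.0 for name in BUCKETS}
    for p in positions:
        for name, (lo, hi) in BUCKETS.items():
            if lo <= p.maturity_years < hi:
                totals[name] += p.dv01
    return totals


def breaches(positions):
    """Names of the buckets whose net exposure exceeds the limit in magnitude."""
    return [name for name, e in bucket_exposures(positions).items() if abs(e) > LIMIT]


def age(positions, dt_years):
    """Roll every position forward in time; maturities shorten, sizes do not change."""
    return [Position(p.maturity_years - dt_years, p.dv01) for p in positions]


# Calendar spread inside bucket #2: each leg alone would break the limit, the pair nets to zero.
today = [Position(2.1, +150.0), Position(2.6, -150.0)]
later = age(today, 0.25)  # the long leg drifts into bucket #1, the short leg stays in #2

print("breaches today:", breaches(today))
print("breaches later:", breaches(later))
print("net exposure today/later:", sum(p.dv01 for p in today), sum(p.dv01 for p in later))
```

In this toy setup the net exposure is identical on both dates, yet the monitoring
flags both buckets once the long leg crosses the 2-year boundary.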


The Unused Limit Risk Model: Overview 


The model is formulated in terms of fractions of exposures with respect to limits.
First, we present the model with only one exposure $\$\mathcal{E}$, and generalize it below.
Call $\$L_{\mathcal{E}}$ the limit for this exposure and write the fraction of the limit utilized by
the exposure as⁴

$$f_{\mathcal{E}} = \$\mathcal{E} \,/\, \$L_{\mathcal{E}} \qquad (40.1)$$

If the limits are respected, as we shall assume⁵, we have $f_{\mathcal{E}} \le 1$. This
means there is a barrier at $f_{\mathcal{E}} = 1$.


Exposure Fractions: Time Dependent Decomposition 


The unused limit risk model relies on an estimate of an exposure $\$\mathcal{E}(t)$
decomposed into a drift term and a volatility term at time $t$,
$\$\mathcal{E}(t) = \$\mathcal{E}^{\text{drift}}(t) + \$\mathcal{E}^{\text{vol}}(t)$. We divide this decomposition by the limit to
get a model for the fraction of the used limit,


$$f_{\mathcal{E}}(t) = f_{\mathcal{E}}^{\text{drift}}(t) + f_{\mathcal{E}}^{\text{vol}}(t) \qquad (40.2)$$


4 Positive and Negative Limits; the case of Gamma: There can be both positive and
negative exposure limits, not necessarily equal. If the exposure is negative, $\$\mathcal{E} < 0$, then
we choose the negative exposure limit $\$L_{\mathcal{E}}$ to define the fraction $f_{\mathcal{E}}$. The fraction $f_{\mathcal{E}}$ is
always non-negative. For example, consider gamma. Only negative gamma is a risk.
Positive gamma is an asset for which you have to pay. Therefore, the definition is to
consider only negative gamma exposure with a negative gamma limit. The fraction for
gamma is still between 0 and 1.


5 Limit Exceptions, Leaky Barriers, and the Three Strikes Rule: Including limit
exceptions would be a messy task and involve the barrier at $f_{\mathcal{E}} = 1$ being leaky or porous.
This violates my Three Strikes Rule, namely being (1) difficult to have intuition, (2)
difficult to get parameters and to calculate, and (3) difficult to explain to management.
For these reasons, refinements of the model may not be as desirable as might appear
academically.


The fraction starts at its current, or spot, level $f_{\mathcal{E};\text{spot}}$. The drift term
$f_{\mathcal{E}}^{\text{drift}}(t)$ over the time period assumed for economic capital gives the forward
expected or most likely exposure fraction, $f_{\mathcal{E};\text{fwd}}$. This expected value is to be
specified by someone who understands the general business strategy and the
likely behavior of the desk.

The fraction volatility term $f_{\mathcal{E}}^{\text{vol}}(t)$ describes the uncertainty $df_{\mathcal{E}}$ about the
forward expected level. We shall discuss the details of this term below.
The idea is shown in the picture below:

[Figure: Exposure fraction volatility around the expected forward exposure, showing $df_{\mathcal{E}}$ at 1 SD]

The Two Components of the Unused Limit Risk Model 
With these two terms specified, the model consists of two related components: 


• A “forward” denoted $\$C^{\text{Fwd}}$, depending on the expected exposure level from
the drift. This would be present even if there were no volatility, i.e. certainty
in the change in the exposure.

• An “option” denoted $\$C^{\text{UpOutCallOption}}$, depending on the volatility of the
exposure level. The desk owns a call option. The option is to increase, if the
desk likes, its exposure up to the limit. Because of the limit, the option is an
up & out call barrier option.


Therefore, the model is

$$\$C^{\text{UnusedLimits}} = \$C^{\text{Fwd}} + \$C^{\text{UpOutCallOption}} \qquad (40.3)$$

The forward exposure can be either above or below the current spot level. If
the forward level is below the current level, the “forward”, proportional to
$[f_{\mathcal{E};\text{fwd}} - f_{\mathcal{E};\text{spot}}]$, will be negative. That means that the desk gets a contribution
reducing its future risk. See below for more discussion on this point.

In order to proceed, we need a model for the volatility term. The reader will
not be astonished to learn that we propose using a lognormal model for the
exposure vol term, or because the limit is assumed constant, a lognormal model
for the fraction of used limit. With the forward value of the fraction being
specified, the model can be cast into the familiar framework of an up-and-out
European call option with a constant continuous barrier. The option is struck at
the forward, $E = f_{\mathcal{E};\text{fwd}}$, not the spot⁷. The option notional amount $\$N$ is the
economic capital at the limit, $\$EC_{\text{LimitMax}}$, because that is the risk corresponding
to the desk using its full limit. A risk-free rate $r$ is used for discounting over the
option period $\tau$. There is also an effective “dividend yield” $y_d$. This is not
“real”; it is just used to reproduce the forward fraction, viz

$$y_d = r - \frac{1}{\tau}\ln\left(\frac{f_{\mathcal{E};\text{fwd}}}{f_{\mathcal{E};\text{spot}}}\right) \qquad (40.4)$$
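
As a quick check of Eq. (40.4), the following minimal sketch computes the
effective dividend yield and confirms that it reproduces the forward fraction.
The inputs are the ones used in the illustrative example later in this chapter
(5% continuously compounded rate, spot fraction 20%, forward fraction 28%, one
year); the function name is a hypothetical choice.

```python
import math


def effective_dividend_yield(r, f_spot, f_fwd, tau):
    """Eq. (40.4): the yield that makes the lognormal forward of f_spot equal f_fwd."""
    return r - math.log(f_fwd / f_spot) / tau


r, f_spot, f_fwd, tau = 0.05, 0.20, 0.28, 1.0
y_d = effective_dividend_yield(r, f_spot, f_fwd, tau)

# Consistency check: the spot grown at (r - y_d) over tau gives back the forward fraction.
implied_fwd = f_spot * math.exp((r - y_d) * tau)

print(f"effective dividend yield: {y_d:.2%}")
print(f"implied forward fraction: {implied_fwd:.2%}")
```

The yield comes out negative because the forward fraction lies above the spot
fraction by more than the risk-free growth.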


The barrier fraction level is the maximum fraction $H = f_{\mathcal{E};\text{max}} = 1$.

The standard up-out call option model is then used⁸,

$$\$C^{\text{UpOutCallOption}} = \$N \cdot [\text{Standard UO Call Formula}] \qquad (40.5)$$


6 Why Lognormal Dynamics for the Exposure Fractions? There is some empirical
evidence that a fraction is reasonably approximated as lognormal (e.g. scatter plots of df
vs f exhibiting some linearity). Different behaviors are seen for other exposure fractions,
including double peaks (at a low fraction during times of substantial hedging and a high
fraction otherwise). The model uses a mean and width of the exposure distribution.

It is possible that even if the model were refined, the mean and width would not be
substantially different, giving similar results. In any case, the model refined along the
lines of including more realistic exposure distributions would suffer from the same
three-strikes problem described above. The model in the text reaches a reasonable compromise.


7 Alternate Model for Unused Limit Risk: An alternate model contains only an up-out
call option, but struck at the spot or current fraction. The extra Economic Capital from
this alternate model is always positive. However, this alternate model does not allow for
the deterministic reduction in risk when, for example, a desk is deliberately pursuing an
exposure reduction strategy or policy.


Basket Approach to Multiple Exposures and Limits 


We have discussed a single limit so far. Although the number of limits is less
than the number of exposures, considerable simplification still has to be made in
order to get a tractable calculation scheme. To this end, it is convenient to use a
basket option approach. The most important risky exposures $\{\$\mathcal{E}_\alpha\}$ are
specified, defining the most important risky fractions $\{f_{\mathcal{E}_\alpha}\}$. Positive weights
$\{w_\alpha\}$ are specified, with $\sum_\alpha w_\alpha = 1$. The fraction used in the model is the
basket fraction, i.e. the weighted sum of fractions,

$$f_{\mathcal{E}} = \sum_\alpha w_\alpha f_{\mathcal{E}_\alpha} \qquad (40.6)$$

Since each $f_{\mathcal{E}_\alpha} \le 1$, we still have $f_{\mathcal{E}} \le 1$ constrained to be below the barrier.
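
A minimal sketch of the basket fraction of Eq. (40.6) follows. The weights are
the ones quoted in the illustrative example below (20%, 60%, 10%, 10%); the four
individual fractions are hypothetical numbers chosen only so that the basket
comes out at 20%, and the function name is likewise an assumption.

```python
def basket_fraction(weights, fractions):
    """Weighted sum of used-limit fractions, Eq. (40.6)."""
    if len(weights) != len(fractions):
        raise ValueError("need one weight per fraction")
    if any(w <= 0.0 for w in weights):
        raise ValueError("weights must be positive")
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    if any(not 0.0 <= f <= 1.0 for f in fractions):
        raise ValueError("each fraction must lie between 0 and 1 (limits respected)")
    return sum(w * f for w, f in zip(weights, fractions))


# Order: DV01, spread, vega, FX.
weights = [0.20, 0.60, 0.10, 0.10]
fractions = [0.30, 0.15, 0.25, 0.25]  # hypothetical current limit utilizations

f_basket = basket_fraction(weights, fractions)
f_saturated = basket_fraction(weights, [1.0, 1.0, 1.0, 1.0])

print(f"basket fraction: {f_basket:.2%}")
print(f"basket fraction with every limit saturated: {f_saturated:.2%}")
```

Because the weights are positive and sum to one, the basket fraction cannot
exceed one when each individual fraction respects its limit.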


Illustrative Example for Unused Limit Economic Capital 


Here is an illustrative example. The exposures $\{\$\mathcal{E}_\alpha\}$ of the exotics desk with
the Backflip Options portfolio⁹ are mostly DV01, spread, vega, and FX. The
Market Risk Manager for that desk, who is intimately familiar with the risk,
specifies relative importance risk weightings $\{w_\alpha\}$ of 20%, 60%, 10%, and
10% respectively for these exposures. The current weighted fraction, which
functions as the spot value, is $f_{\mathcal{E};\text{spot}} = 20\%$. The lognormal volatility of the
fraction is $\sigma(df_{\mathcal{E}}/f_{\mathcal{E}}) = 45\%$ for the period of time of the calculation (say 1
year). This would be either estimated or determined from the historical utilization
data. At one SD, the uncertainty in the fraction is $\pm 45\% \times 20\% = \pm 9\%$.

The Economic Capital as determined from the spot exposures is
$\$EC_{\text{spot}} = 25\text{MM}$. With the limits saturated, we therefore would get a result
five times larger, $\$EC_{\text{LimitMax}} = 125\text{MM}$. This result is not reasonable unless it
is highly likely that the desk will in fact have exposures saturating the limit¹⁰.
However, the desk exposures are far from the limit and are not likely to reach
anywhere near the limit. The risk manager determines that the most likely value
for the fraction at the end of the time period for the Economic Capital (e.g. one
year) is $f_{\mathcal{E};\text{fwd}} = 28\%$. Because this is judged the forward most likely value for
the exposure, the desk is charged an additional amount (after discounting with
discount factor $DF$) of the first component forward¹¹,

$$\$C^{\text{Fwd}} = [f_{\mathcal{E};\text{fwd}} - f_{\mathcal{E};\text{spot}}] \cdot DF \cdot \$EC_{\text{LimitMax}} \approx 7.6\% \cdot \$EC_{\text{LimitMax}} \qquad (40.7)$$

The second component (the up & out call option) describes the uncertainty in
the risk manager's judgment due to the volatility in the exposures on the desk.
The option has the strike at the forward, $E = 28\%$, and the notional
$\$N = 125\text{MM}$. With the 45% lognormal volatility, the one SD range of the
forward fraction is around (19%, 37%). Note that with these parameters it is
highly unlikely that the exposure will get near the maximum level at $f_{\mathcal{E};\text{max}} = 1$.
The up-out call option in this case is therefore close to the call option with no
barrier at all. We get

$$\$C^{\text{UpOutCallOption}} \approx 4.6\% \cdot \$EC_{\text{LimitMax}} \qquad (40.8)$$

The Economic Capital from both components due to the unused limit risk is
therefore 4.6% + 7.6% = 12.2% of the maximum economic capital, viz¹²

$$\$C^{\text{UnusedLimits}} \approx 12.2\% \cdot \$EC_{\text{LimitMax}} \approx \$15\text{MM} \qquad (40.9)$$

The total Economic Capital for the desk is therefore not $25 MM based on spot
exposures, but rather

$$\$EC_{\text{Tot}} = \$EC_{\text{spot}} + \$C^{\text{UnusedLimits}} \approx \$40\text{MM} \qquad (40.10)$$

Notice that although the Economic Capital has increased substantially due to
the future exposure considerations, the result is still much lower than the
maximum amount $\$EC_{\text{LimitMax}} = 125\text{MM}$.


8 Standard Barrier Option Model: See the discussion in Ch. 17. The fact that the model
for unused limits can be cast in familiar form is a distinct advantage in explaining it to
traders and management.


9 Backflip Options? Recall the amusing but dead-serious practical exercise for the
reader in Ch. 3, which of course you already did.


10 High Limits do NOT Necessarily Imply a Bigger Economic Capital: The model
resolves this possible problem. If the limits of some desk were to increase but the desk’s
exposures were not projected to get anywhere near the limit, then the economic capital
will not increase. This is because the limit barrier is essentially invisible.


11 Other Parameters: Here, r = 5% ctn., 365. The “effective dividend yield” using the
formula was equal to −28.65%. Again, there are no dividends here; this parameter is just
present to reproduce the given forward fraction value.


12 Alternate Model: The alternate model with only one component (as mentioned in a
footnote above) produces around 8.8%, rather than 12.2%, for these parameters.


EC Can be Reduced if Exposures are Expected to Decrease 
Say that the risk manager had decided that the most likely forward fraction was
lower than spot, $f_{\mathcal{E};\text{fwd}} < f_{\mathcal{E};\text{spot}}$. Then it is possible that the forward
$\$C^{\text{Fwd}} < 0$ could have a larger magnitude than the positive $\$C^{\text{UpOutCallOption}}$,
resulting in a reduction in Economic Capital, $\$EC_{\text{Tot}} < \$EC_{\text{spot}}$.


A reduction in Economic Capital for deterministic risk reduction is eminently 
reasonable. For example, suppose that corporate management decides to wind 
down certain exposures. Then future risk will certainly decrease. This sanity 
feature is provided by this two-component model. 
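
The arithmetic of the illustrative example, and the sign change when exposures
are expected to decrease, can be checked with a short sketch. It assumes the
textbook closed-form up-and-out call for lognormal dynamics with a continuous
barrier and no rebate (the form commonly attributed to Reiner and Rubinstein),
used here as a stand-in for the “Standard UO Call Formula” of Eq. (40.5), which
this chapter does not spell out. The function names and the 10% forward fraction
in the second case are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def up_out_call(spot, strike, barrier, vol, tau, r, q):
    """Up-and-out call, continuous barrier, zero rebate, valid for strike < barrier."""
    if spot >= barrier or strike >= barrier:
        return 0.0
    s = vol * math.sqrt(tau)
    mu = (r - q - 0.5 * vol ** 2) / vol ** 2
    shift = (1.0 + mu) * s
    x1 = math.log(spot / strike) / s + shift
    x2 = math.log(spot / barrier) / s + shift
    y1 = math.log(barrier ** 2 / (spot * strike)) / s + shift
    y2 = math.log(barrier / spot) / s + shift
    sd = spot * math.exp(-q * tau)    # yield-discounted spot
    kd = strike * math.exp(-r * tau)  # discounted strike
    hs = barrier / spot
    a = sd * norm_cdf(x1) - kd * norm_cdf(x1 - s)
    b = sd * norm_cdf(x2) - kd * norm_cdf(x2 - s)
    c = sd * hs ** (2 * (mu + 1)) * norm_cdf(-y1) - kd * hs ** (2 * mu) * norm_cdf(-y1 + s)
    d = sd * hs ** (2 * (mu + 1)) * norm_cdf(-y2) - kd * hs ** (2 * mu) * norm_cdf(-y2 + s)
    return a - b + c - d


def unused_limit_components(f_spot, f_fwd, vol, tau, r, ec_limit_max):
    """Forward and option components, in the same units as ec_limit_max."""
    y_d = r - math.log(f_fwd / f_spot) / tau                     # Eq. (40.4)
    forward = (f_fwd - f_spot) * math.exp(-r * tau) * ec_limit_max
    option = up_out_call(f_spot, f_fwd, 1.0, vol, tau, r, y_d) * ec_limit_max
    return forward, option


EC_SPOT, EC_LIMIT_MAX = 25.0, 125.0  # $MM, from the example

# Case 1: the example in the text (forward fraction 28% above spot 20%).
fwd, opt = unused_limit_components(0.20, 0.28, 0.45, 1.0, 0.05, EC_LIMIT_MAX)
ec_total = EC_SPOT + fwd + opt

# Case 2: hypothetical wind-down, forward fraction 10% below spot 20%.
fwd_down, opt_down = unused_limit_components(0.20, 0.10, 0.45, 1.0, 0.05, EC_LIMIT_MAX)
ec_total_down = EC_SPOT + fwd_down + opt_down

print(f"forward: {fwd / EC_LIMIT_MAX:.1%}, option: {opt / EC_LIMIT_MAX:.1%} of EC at limit")
print(f"total EC: {ec_total:.1f} MM; wind-down total EC: {ec_total_down:.1f} MM")
```

Under these assumptions the sketch gives roughly 7.6% and 4.6% of the limit-level
capital for the two components, in line with Eqs. (40.7) and (40.8), while the
wind-down case yields a total below the spot Economic Capital.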


Exposure Scenarios: Comparison to VAR Exposure Reduction 


In Ch. 27, we discussed enhancements to VAR involving scenarios for exposure 
reduction under assumed stressed market environments. The situation here is a bit 
different. The most likely forward exposure scenario, as intended here, is 
supposed to start from the current market environment. If the current market 
environment is not stressed, the current exposure level is not constrained by a 
stressed market, and the forward estimate would be made under normal business 
conditions. 

In a time-dependent simulation, the example presented in the text would have 
the current exposure fraction of 20% under normal conditions estimated to 
increase to 28% under normal conditions. Then, if an “asteroid” hits the market, 
the desk would presumably start at some later time to reduce exposure in that 
stressed market environment. This could all be treated explicitly if we used time- 
dependent simulations. The present model just approximates the effects using a 
simple add-on procedure. 

Consider the drawing below, which should illuminate these ideas:

[Figure: Two-step exposure dynamics before the stress event (asteroid) and during the
stress period. Starting from $f_{\mathcal{E};\text{spot}}$, the exposure fraction fluctuates with its
volatility around the expected forward before the stress event starts; deliberate
exposure reduction follows (after the decision time, during the liquidity time).
Time axis marks: Stress Event Starts, Stress Period Ends.]


Unused Limit Economic Capital for Issuer Credit Risk 


The above model focused on market risk. Exactly the same formalism in 
principle can be used for unused limits for issuer credit risk. 

Credit limits can be formulated by geographical region (e.g. Latin America), 
industry (e.g. industrials), credit level (e.g. high yield), specific issuer (e.g. GM), 
or any other criteria used to classify bonds. 

The exposure corresponding to a given credit limit would have risk due to the
same sort of decomposition, $\$\mathcal{E}(t) = \$\mathcal{E}^{\text{drift}}(t) + \$\mathcal{E}^{\text{vol}}(t)$. The drift
component would be specified as the most likely forward credit exposure, and the
volatility component would be specified as credit exposure fluctuations about the
forward credit exposure. Again the credit model would consist of two
components, the credit forward and the knockout UO credit call option.

The parameter estimations for the credit unused limits could follow a similar 
procedure to that explained above for market risk. 


PART V: PATH INTEGRALS, GREEN FUNCTIONS, AND OPTIONS


41. Path Integrals and Options: Overview
(Tech. Index 4/10)


In previous chapters in this book, path-integral techniques were in fact used 
repeatedly for valuation. In this part of the book, we deal directly with the 
formalism of path integrals as applied to finance. Those who already know path 
integrals and who want to jump-start into finance might start with these chapters. 
The finance discussion is self-contained. For those who are unfamiliar with path 
integrals, the presentation will have appropriate background material. 

At the same time, because path integrals are in fact Green functions, the 
discussion will be relevant to the Green function approach to options. 


Path Integrals and Physics 


Feynman¹ developed path integrals as a technique used in his Nobel-prize
winning work related to relativistic quantum mechanics. Path integrals constitute 
a powerful and elegant framework for treating problems containing random or 
stochastic variables. This framework is very general. There is a long history of 
path integrals applied to practical problems in physics. 


Path Integrals and Finance 


Path integrals applied to finance provide a powerful, understandable approach. 
Path integrals are useful for options. This is because finance theory involves 
diffusion equations, based on assumptions using random variable models for 
interest rates, stock prices, exchange rates, etc. The diffusion equation is solved 
directly and exactly by a path integral².


1 Feynman Story: Everybody likes to tell stories about themselves and Feynman. Here is
mine. I screwed up my courage and knocked on his door, but he was out. When he came
back he saw me sitting on the floor and said “Who are you?” I said I had some ideas
about diffraction scattering. He said “Well I guessed wrong on that one”, invited me in,
and spent 2 hours discussing physics with me, on a topic that was not his main interest.
He brought up every issue related to the details of the topic. I will never forget it.


2 Relation to Quantum Mechanics and Rigor: Some finance professors have
erroneously concluded that the path integral approach to finance is not rigorous, perhaps
misunderstanding the difference here with quantum mechanics. The diffusion equation is
mathematically simpler than the Schrödinger equation, having solutions with no
oscillations in time. The Schrödinger equation can be turned into a diffusion equation
through a so-called Wick rotation. Issues of rigor (uninteresting as they are), are basically
nonexistent in finance relative to quantum mechanics.


The following points are basic:


• An explicit feature is the picturesque idea of future paths of the underlying
variables as time progresses. This increases physical intuition.

• The fundamental idea of a contingent claim being equal to the expectation
value of the discounted terminal value, consistent with boundary constraints,
is manifest.

• Complicated obscure mathematics is avoided. The reader may be comforted
to know that mathematics background for most analytic calculations with
path integrals requires only an ability to perform Gaussian integrals by
completing the square. Schwartz distributions (Dirac delta functions) connect
the stochastic calculus with path integrals in a straightforward fashion.

• The no-arbitrage conditions are implemented simply by specifying drift
parameters in the path integral through external constraints.

• Consistency is obtained with the standard no-arbitrage hedging recipes.

• Path integrals can be evaluated numerically, e.g. using all the usual
techniques, including binomial (or multi-nomial) discretizations, grid
discretizations, and Monte-Carlo simulation. General path-integral
discretization, pioneered in finance by Castresana and Hogan, provides an
efficient and flexible numerical approximation technique.

• Generalization to N dimensions for applications to multi-factor models is
straightforward.


A Few Basic Details of Path Integrals
The path integral evaluates the consequences of fluctuations of random variables
$x_\alpha(t)$ as a function of time $t$. The probability distribution function (pdf) of the
fluctuations $d_t x_\alpha(t) \equiv x_\alpha(t + dt) - x_\alpha(t)$ has to be specified. Averaging or
finding expectation values of some quantity $C$ then involves merely integrating
$C$ times the pdf over the possible values of the random-variable fluctuations as
time progresses. This is just the path integral.

The path integral gives the propagation of information in time by consecutive
small (or in the limit, infinitesimal) time steps of size $dt$ in such a way that the
underlying diffusion equation is manifestly satisfied at each step. Each such small
step is accomplished by including a “propagator”. The path integral in fact is the
Green function solution to the underlying diffusion equation.

It is important to understand that the path integral is essentially just a
convolution of standard-calculus integrals.

A standard physics approximation consists of the WKB semi-classical
approximation. In finance, (up to a convexity term) this starts with a deterministic
forward path of stock prices, interest rates, etc. depending on the application. The
size of the fluctuations around the forward path is measured by volatility.

In simple cases, the path integral can be evaluated explicitly. Whenever an 
analytic solution exists, it can be derived using path-integral techniques. 
Discretization provides a natural base for numerical approximations. 
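
The statement that the path integral is a convolution of ordinary integrals can
be illustrated numerically. The sketch below is not the discretization method of
Castresana and Hogan; it is a plain grid quadrature under assumed lognormal
(Black-Scholes) dynamics, with hypothetical parameter values and function names.
A European call payoff is propagated backward over several time steps, each step
being one Gaussian propagator integral with discounting, and the result is
compared with the closed-form Black-Scholes value.

```python
import math


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def black_scholes_call(s0, k, r, vol, t):
    d1 = (math.log(s0 / k) + (r + 0.5 * vol ** 2) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def path_integral_call(s0, k, r, vol, t, n_steps=10, dx=0.01, half_width=2.5):
    """Backward propagation of the payoff on a grid in x = ln(S).

    Each time step applies the Gaussian propagator for x (mean shift
    (r - vol^2/2) dt, variance vol^2 dt) and one step of discounting.
    """
    dt = t / n_steps
    drift = (r - 0.5 * vol ** 2) * dt
    var = vol ** 2 * dt
    n_half = int(round(half_width / dx))
    n = 2 * n_half + 1
    x0 = math.log(s0)
    xs = [x0 + (i - n_half) * dx for i in range(n)]

    # The propagator depends only on the offset between grid points, so one
    # banded set of weights (truncated at about 8 standard deviations) suffices.
    band = int(math.ceil((abs(drift) + 8.0 * math.sqrt(var)) / dx))
    norm = dx / math.sqrt(2.0 * math.pi * var)
    kernel = [norm * math.exp(-((m * dx - drift) ** 2) / (2.0 * var))
              for m in range(-band, band + 1)]

    disc = math.exp(-r * dt)
    values = [max(math.exp(x) - k, 0.0) for x in xs]  # terminal payoff
    for _ in range(n_steps):
        new_values = [0.0] * n
        for i in range(n):
            acc = 0.0
            for m in range(-band, band + 1):
                j = i + m
                if 0 <= j < n:
                    acc += kernel[m + band] * values[j]
            new_values[i] = disc * acc
        values = new_values
    return values[n_half]  # value at the initial point x0 = ln(s0)


pi_price = path_integral_call(100.0, 100.0, 0.05, 0.20, 1.0)
bs_price = black_scholes_call(100.0, 100.0, 0.05, 0.20, 1.0)
print(f"path-integral grid value: {pi_price:.4f}")
print(f"Black-Scholes value:      {bs_price:.4f}")
```

Because Gaussian propagators compose exactly, the number of time steps does not
matter for a European payoff; the residual difference comes from the grid
spacing and truncation. Intermediate steps become essential once boundary
conditions (barriers, early exercise) are imposed between them.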


Summary of the Chapters on Path Integrals and the Reggeon Field Theory 
The four chapters on path integrals and one on the Reggeon Field Theory are: 


• Ch. 42: Path Integrals and Options I. This chapter presents an introductory
overview of the use of path integrals in options pricing including some
pedagogical examples. The connection with stochastic equations is exhibited.
We give a transparent proof of Girsanov’s theorem. We deal with no
arbitrage and hedging in the language of path integrals. Finally, we give
some results for local volatility in perturbation theory.


• Ch. 43: Path Integrals and Options II. This chapter contains the path-integral
framework for one-factor term-structure interest rate models, including
Gaussian and mean-reverting Gaussian. Results for general models including
arbitrary rate-dependent volatilities are given. Models with memory effects
are also presented. It is shown explicitly how the stochastic equations for rate
dynamics are built directly into the path integral.


• Ch. 44: Path Integrals and Options III. This chapter presents some aspects of
numerical methods for options based on path-integral techniques. The
fundamental path-integral discretization method originated by Castresana and
Hogan is described. We emphasize that standard binomial and multinomial
approximations, Monte-Carlo simulations, etc. are just techniques for
evaluating the path integral. We also discuss Smart Monte Carlo, American
Monte Carlo, and calculations of Greeks within the path integral context.


• Ch. 45: Path Integrals and Options IV. This chapter presents options in the
presence of many random variables, including principal component path
integrals.


• Ch. 46: Reggeon Field Theory (RFT). This chapter contains applications to
finance of this theory of nonlinear diffusion. It is actually separate from path
integrals, but because it arises from physics I have included it in this section.
The RFT is soluble under certain conditions, and can produce non-Brownian
critical exponents and scaling laws, calculable in certain approximations. We
translate the RFT into finance language in a direct way. Since the 1st edition I
have found results across markets in crisis that on average are in surprising
relation to RFT theoretical calculations done long ago, without any parameter
fitting. Moreover, scaling exponents can be used to construct a model for the
probability of entering into a crisis. The model results are much better than
chance for the probability of crises within a year for equity markets, shown
by backtesting. I believe these results are highly significant and show a path
forward that may be profitable for further exploration by others in the future.


42. Path Integrals and Options I: Introduction 
(Tech. Index 7/10) 


Summary of this Chapter 


Path integrals are widely used in physics for treating problems with stochastic 
variables. In particular, diffusion equations have path integrals as exact solutions. 
This chapter presents an introductory overview¹ of the use of path integrals in 
options pricing²,³. We begin with European stock options, the venerable 
Black-Scholes model. We exhibit how Bermuda and American options fit into the 
path-integral framework. A list of references is at the end of the chapter. 

Green functions and semigroup techniques are used throughout. The 
connection of path integrals with stochastic equations is exhibited explicitly. We 
also give a transparent proof of Girsanov’s theorem. Finally, we deal with no- 
arbitrage, with which path integrals are fully consistent, and discuss hedging in 
the language of path integrals. 

Elsewhere we apply path integrals to term-structure interest-rate models (Ch. 
43), barrier options (Ch. 17-18), 2D options (Ch. 19), etc. 

Mathematics background for most of the material in this chapter will not 
require much more than an ability to perform Gaussian integrals. Facts regarding 
differential equations, Fourier transforms, and Dirac delta-functions will be 
explained as needed. 


The reader is assumed somewhat familiar with the financial models, though 
for convenience in the presentation we shall include enough finance information 
to keep the discussion self-contained. 


¹ History and Acknowledgements: This chapter is largely taken from the first paper in 
Ref. i, and is based on work done in 1986-1987 as a consultant to Merrill Lynch, while on 
leave from the French CNRS. I thank Santa Federico for asking the question to establish 
the connection between stochastic equations and path integrals that landed me on Wall 
Street. I also thank Andy Davidson and Mike Herskovitz for support during this time. 


² Already Know About Path Integrals? Already Know the Models? Those familiar 
with path integrals will find the discussion of path integrals trivial; they should focus on 
the finance. Those who already know the finance should focus on the path integral 
formalism. Very few people know both well. 


³ To the Quants: Don’t freak out. You already know something about path integrals: 
Anybody who has done Monte Carlo simulations, constructed binomial lattices, solved 
diffusion equations using analytic methods etc. has essentially been using path integrals. 
Hopefully, the general framework and connection between these ideas will become clear. 


Introduction to Path Integrals 


Feynman developed path integrals as a technique used in his Nobel-prize winning 
work related to relativistic quantum mechanics. Further work by Kac and 
others, along with many applications to physics, soon followed. Path integrals 
provide a powerful, elegant framework for treating problems containing 
stochastic variables. Any diffusion equation has a path integral solution. 

The special case of the path integral that we will be using is also called the 
Wiener integral. 

Standard options pricing models involve (backward Kolmogoroff) diffusion 
equations, based on considerations using random variable models for interest 
rates, stock prices, exchange rates, etc. These can be Brownian, possibly with 
mean reversion, or other. 

In general, the path integral is useful because it affords a natural framework 
to visualize physical situations and to carry out calculations. 

The path integral has proved useful as a realistic calculation tool in finance. 
In simple cases, e.g. the Black-Scholes model (a free-diffusion Gaussian model), 
the path integral can be evaluated explicitly. Somewhat more generally, path 
integrals can be analytically evaluated with Gaussian dynamics if the boundary 
conditions and parameters are simple enough. This is because in that case, simple 
consecutive convolutions of Gaussians occur, and the result is again a Gaussian. 

When the path integral cannot be evaluated analytically, standard numerical 
methods are used. These include Monte Carlo simulations, binomial (or 
multinomial) approximations, etc. PDE solvers of diffusion equations fit in also, 
because as we just said the path integral is the solution of the diffusion equation. 

The formalism of path integrals applied to options is known to some 
members of the quantitative-finance community; see especially the early work 
of Geske and Johnson⁴. One aim among others of this introductory chapter on 
path integrals is pedagogical, in order to make path integral concepts comfortable 
to the reader by using explicit examples, and in order to emphasize the generality 
of the approach. 


Basic Idea of Path Integrals 


The basic idea of a path integral is the propagation of information in time by an 
infinite set of infinitesimal time steps in such a way that the underlying 
differential equation is manifestly satisfied at each step. This idea is illustrated⁵ in 
Fig. 1 at the end of the chapter. Information about the option exercise value at the 
expiration or strike date $t^*$ is propagated backward to the present time $t_0$ by a 
series of consecutive time steps $\Delta t < 0$, where $\Delta t$ is finite. We also use the 
notation $dt > 0$ that will always be infinitesimal. At the end of a calculation, $dt$ 
and/or $\Delta t$ may be taken formally to zero to get the continuous limit. 


⁴ Cognoscenti: Steve Ross tells me that he knew path integrals were relevant. There may 
be others that also had this realization. Still, from my experience, the path-integral 
formalism is not well known in the general finance community, even at this late date. 


The Propagator or Green Function 


Each such small $\Delta t$ step is accomplished by including a "propagator", which is 
the transition probability density or Green function solution to the underlying 
diffusion equation over time $\Delta t$. Paths are thus generated in the co-ordinate 
x-space between the present time $t_0$ and time $t^*$. The dependence of $x$ on the 
financial variables is specified by the model. For example, in the Black-Scholes 
(BS) model, $x$ is the logarithm of the stock price on which the option is written. 
Each path is associated with a probability measure or weight specified by the 
model⁷, and all paths are summed over by the path integral. Probabilities for 
different paths may or may not exhibit a degeneracy, depending on details like 
whether the volatility and other parameters are or are not $x$-dependent. 


The Path Integral is Just a Convolution of Ordinary Integrals 

The path integral is actually a functional. That is, the path integral depends on 
paths in co-ordinate $x$ space. The paths themselves are functions depending on 
the time. Thus, the path integral is not a standard integral, but rather a large 
multidimensional integral (formally infinite dimensional in the limit $\Delta t \to 0$), 
consisting of a convolution of ordinary integrals. 


⁵ Figures: The numbered figures, taken from Ref. i, are at the end of the chapter. 


⁶ Notation: $\Delta t$ and $dt$: $\Delta t$ in this discussion is a time interval that may remain finite, 
while $dt$ is always eventually to be taken infinitesimal. Formally, we can set $\Delta t = -k\,dt$ 
for some $k$, and when $dt \to 0$ we let $k \to \infty$ to keep $\Delta t$ finite. Sometimes we will take 
$\Delta t \to 0$. The circumstances will always be made clear. 


⁷ Weights and Probabilities for Paths: In a Monte Carlo simulation of a path integral, 
each path generated has weight = 1. However, the probability of generating given path 
bunches passing through a set of “bins” depends on the model probability distribution 
function. It can be useful for numerical approximation to group together paths in bunches 
or “effective paths”, which are then associated with appropriately integrated probabilities. 
For more discussion, see Ch. 44. 


Approximations to Path Integrals 

A standard approximation consists of searching for a "best" Gaussian 
approximation. This is usually either the WKB semiclassical approximation or 
a non-interacting free diffusion approximation. One can then perform an 
expansion around the Gaussian in a perturbation series. If the resulting 
fluctuations are small as measured by some small parameter, the perturbation 
series can prove to be quite useful numerically whether or not it converges 
formally, as exhibited by Quantum Electrodynamics. 

In finance, the classical approximation is replaced by fluctuations about 
deterministic forward quantities. The size of the fluctuations is measured by 
volatility. In general, either the formalism is simple enough so that analytic 
solutions are possible or else numerical techniques are adopted. Perturbation 
theory itself is usually not performed. 

In a more general setting, the discretization of the path integral itself provides 
a natural base for reasonable numerical approximation to a theory when analytic 
results are not available⁸. Monte-Carlo methods constitute a popular tool for the 
numerical evaluation of difficult path integrals. 

In the best of cases, there is a preferred path (or set of paths), about which 
fluctuations are small. 


Heretical Remarks on Rigor and All That 
We use notation close to that of Feynman. For a straightforward presentation 
and in the spirit of Ref. [iii], we shall not follow an unprofitable mathematically 
rigorous development, which is not required for applications⁹. Appropriately 
sophisticated analysis is performed when needed. 


⁸ Numerical Path Integral Applications, the Castresana-Hogan Approach: Juan 
Castresana and Marge Hogan have shown how to discretize path integrals explicitly in 
practice. The numerical methods based on this discretization are reliable, flexible, and 
fast. These methods have been used in production on the desk for Bermuda swaptions, 
among other products. This approach is discussed in Ch. 44. 


⁹ Too Much Mathematical Rigor in Finance? YES! The whole path-integral discussion 
can, if desired, be put on a much more mathematically rigorous basis. Path integrals in 
finance are simpler than for quantum mechanics because there are no oscillations, making 
the theory a-priori mathematically well defined. See Glimm and Jaffe, Ref, p. 44. No 
errors are made using the path integral applied to finance. 

However, following Feynman (Feynman and Hibbs, ref. p. 94), it is difficult to see the 
utility of a full-court press for rigor when financial models are only approximate, i.e. 
various assumptions behind the models are manifestly violated in the real world. 

There is, moreover, a serious case against too much mathematical rigor in finance. 
Rigor can hide irrelevance. Rigor teaches us nothing new of practical importance. Rigor 
can be counterproductive because it makes the subject appear harder than it really is. The 
worst is that rigor gives a false sense of model validity. 


The Rest of this Chapter 


The organization of the rest of this chapter is as follows. In the next section, the 
Black-Scholes model is discussed in some detail for orientation, followed by the 
inclusion of dividends. For generality, we allow arbitrary dividends, even 
stochastic and time dependent. Then, we give the general form for options with a 
multiple (put/call) schedule, and for American options. 

Appendix 1 presents two straightforward and related derivations of the 
Girsanov theorem using path integrals. One demonstration follows directly 
from the incorporation of the stochastic equations as delta function constraints in 
the path integral, and then carrying out the change of variables explicitly by hand 
to isolate the terms involving the drift¹⁰. 

Appendix 2 contains a discussion of no arbitrage, hedging and path integrals. 

Appendix 3 contains calculations using a local volatility and perturbation 
theory. See Ch. 6 for an introduction to local volatility and skew. 

In Ch. 43 and 45, we present a discussion of stochastic interest rates and 
options that depend on several stochastic variables. A picture of the 2- 
dimensional case is in Fig. 7 at the end of the chapter. 


Path-Integral Warm-up: The Black Scholes Model 


The reader may already be familiar with the Black-Scholes (BS) model. The goal 
here is partly to put old wine in new bottles and to exhibit the path integral 
formalism for those who are unfamiliar with it. We start with demonstrating the 
compatibility of path integrals with the standard “no-arbitrage” framework. At 
the end of the section, we re-derive the same results using a more compact and 
more straightforward approach in which “no-arbitrage” appears as a simple 
parameter specification. Appendix 2 contains more no-arbitrage details. 
Therefore, we begin with stock options. Similar models are used for FX 
(foreign exchange) options, commodity options, and some other types of options. 


The use of excessive rigor in finance parallels physics in the 1960’s for the 
mathematically rigorous axiomatic field theory. One paper (Gell-Mann et. al, Ref.) put 
the situation in perspective: “In particular, the contribution of axiomatic field theory to 
calculations has been less than any pre-assigned positive number, however small”. 

Nonetheless, I repeat that the application of path integrals applied to finance can be 
made as rigorous as you like. 


¹⁰ The Stochastic Equations are in the Path Integrals: For details, see the end of this 
chapter, Appendix 2, and also the next chapter “Path Integrals and Options II”. 


Textbook Discussion in a Path Integral Framework 

The usual discussion¹¹ starts with assuming that $N_S$ shares of stock with 
stochastic price $S(t)$ per share at time $t$ are contained in a portfolio along with 
$N_C$ options with price per option $C(S,t)$. The portfolio value $V$ is 

$$ V = N_S\,S(t) + N_C\,C(S,t) \tag{42.1} $$

In order that there is “no arbitrage”¹², the return of $V$ has to be the same as 
holding risk-free securities, so $V$ is assumed to satisfy 

$$ \frac{dV}{dt} = r_0 V \tag{42.2} $$


Here $r_0$ is the risk-free interest rate (presumed constant for simplicity)¹³. We 
define the volatility $\sigma_0$, also held constant in time for the moment¹⁴. We assume 
the lognormal stochastic equation $dS/S = \mu_S\,dt + \sigma_0\,dz(t)$ with the Wiener 
measure satisfying¹⁵,¹⁶ $[dz(t)]^2 = dt$. Next, we use the expansion for the full 
time derivative of the option value $C$, 

$$ \frac{dC}{dt} = \frac{\partial C}{\partial t} + \frac{\partial C}{\partial S}\frac{dS}{dt}
   + \frac{1}{2}\frac{\partial^2 C}{\partial S^2}\frac{(dS)^2}{dt} \tag{42.3} $$


The Option Diffusion Equation 

Setting the hedge ratio $N_S/N_C = -\partial C/\partial S$ cancels out the stochastic quantity 
$dS/dt$. This produces the diffusion equation for $C(S,t)$ as¹⁷ 

$$ \frac{\partial C}{\partial t} + \mu_0\frac{\partial C}{\partial x}
   + \frac{1}{2}\sigma_0^2\frac{\partial^2 C}{\partial x^2} = r_0\,C \tag{42.4} $$

Here $x = \ln(S)$ acts as a co-ordinate¹⁸, while $\mu_0 = r_0 - \frac{1}{2}\sigma_0^2$ functions as a drift. 
The stock-specific drift $\mu_S$ does not enter since the $dS/dt$ terms cancelled. Also, 
Ito's rule (or alternatively the need to obtain the same diffusion equation under 
change of variable) means the stochastic variable satisfies $dx = dS/S - \frac{1}{2}\sigma_0^2\,dt$, 
while $\partial/\partial x = S\,\partial/\partial S$ and $\partial^2/\partial x^2 = S^2\,\partial^2/\partial S^2 + S\,\partial/\partial S$ 
as ordinary variables. 


¹¹ More No Arbitrage and Hedging: See Appendix 2 for a general approach discussing 
no-arbitrage and hedging. 


¹² No Arbitrage Warm-up and a Joke: The reader might argue that no arbitrage is 
nonsense. If one cannot do better than buying treasuries that produce a risk-free rate, why 
would people go to the trouble of buying options and dynamically hedging them with 
stock? Why should we assume that time-averaged stock returns are equal to the risk-free 
rate, when we all know that stock is riskier than debt, so the stockholder deserves a 
greater return than the bondholder (who because of corporate credit risk, already receives 
a coupon above the risk-free rate)? Nonetheless, options are priced using no arbitrage. The 
answers to the questions are what you need to understand to become a quant or a trader. 
Here is the no-arbitrage joke. The professor and the trader are walking along when 
they both spot a $10 bill on the sidewalk. The professor says, "This is impossible as 
demonstrated by no arbitrage; it must be a mirage". The trader picks up the $10. 


¹³ The “Risk-Free Rate”: This rate is assumed constant in this section. It is actually 
specified over a time period relevant for the option. The type of rate is not unique. It can 
for example be taken as a treasury rate, Libor, a cost-of-funds rate based on Libor plus a 
spread, Fed Funds, a stock rebate rate, etc. Libor is the Street standard. The appropriate 
Libor rate for a given option is obtained by interpolation from the Eurodollar futures and 
swaps markets. For FX options there are two interest rates, the “domestic” and the 
“foreign” rate, that must be considered. See earlier chapters for details. 


¹⁴ A Little Essay on Volatility: For those readers starting the book here, we give a 
practical synopsis of volatility. Relaxing the constant volatility assumption is one of the 
central complicating features of options pricing and hedging. The volatility is taken as 
different for different times (“volatility term structure”). It may also include stock-price 
effects to produce “skew”, needed to match market prices of options with different 
strikes. The volatility is sometimes taken as obeying a stochastic equation, with a 
“volatility of volatility” describing fluctuations of the volatility itself. The volatility takes 
significance from the model in which it is defined, and models are not unique because no 
model describes the statistical properties of the underlying variable except in 
approximation. 

In practice, the option volatility is backed out from interpolating values of the 
volatility needed to obtain agreement with options trading in the market; this defines the 
“implied” volatility. Only a small fraction of possible options actually trade (and your 
option may not trade at all), so the implied volatility may be an interpolated or 
extrapolated quantity. 

The implied volatility is usually compared with the volatility of the stock price 
observed in the past (the “historical” volatility). It is often said that the implied volatility 
is the market’s estimate of future historical volatility, and this is the assumption made in 
the option pricing formalism. However there are all kinds of technical issues affecting 
option prices, and therefore affecting implied volatilities (option supply/demand being an 
example). Therefore, it is hard to know to what approximation this association is true. 
Another complication is that the value of the historical volatility depends on the size of 
the data window. 

Traders naturally hedge options with stock, and therefore the relation of the implied 
volatility to the historical volatility forms an obsessive topic in determining whether 
trading makes or loses money. Sometimes the stock of the hedge is the same as the stock 
(or index) on which the option is written, but often for practical reasons it isn’t. 


¹⁵ Path Integrals and $[dz(t)]^2 = dt$: This is actually just a statement of the width of 
individual Wiener measures that begin the path integral approach. There is nothing 
mysterious about it at all. This equation is valid for the expectation value, not for some 
individual pick of a random number, of course. 


¹⁶ Brownian Motion Limitations: The infinitesimal limit $dt \to 0$ with the expectation 
$[dz(t)]^2 = dt$ assumes that a Brownian-motion diffusion random-walk process of the 
underlying stochastic variable occurs down to the smallest time scales. In the real world 
this idealization of infinitesimal time scale price changes cannot occur (not even a 
computer can react in one picosecond, and people go to sleep sometimes). Following the 
standard literature on options models, we temporarily ignore this problem along with 
other issues of importance, such as discontinuous jumps, possible feedback 
non-linearities in the options price itself, effective phase transitions from disordered to 
coherent actions among investors, etc. 


¹⁷ Extension of the no-arbitrage derivation: See Appendix 2. 
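
As a small numerical illustration of the statement that $[dz(t)]^2 = dt$ holds for the expectation and not for an individual random draw, the following minimal sketch averages $dz^2$ over many Gaussian increments of width $\sqrt{dt}$. The function name, the value of `dt`, the number of draws, and the seed are all assumptions chosen only for illustration.

```python
import math
import random


def mean_squared_increment(dt, n_draws=100_000, seed=0):
    """Average of dz**2 over independent Wiener increments dz = sqrt(dt) * Z."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_draws):
        dz = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += dz * dz
    return total / n_draws


dt = 0.01                              # illustrative time step
ratio = mean_squared_increment(dt) / dt
print(f"<dz^2>/dt averaged over many draws: {ratio:.4f}")

# A single draw of dz**2/dt fluctuates widely around 1.
rng = random.Random(1)
print([round(rng.gauss(0.0, 1.0) ** 2, 2) for _ in range(5)])
```

The average settles near one only because many draws are combined; each individual value of $dz^2/dt$ is a chi-squared variable with large scatter.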


Solution of the Option Diffusion Equation 
The solution to this equation is classic. Let us take a moment to recall its 
derivation. Write the formal Taylor expansion for a time step $\Delta t$ as 

$$ C(x, t + \Delta t) = e^{\Delta t\,\partial_t}\, C(x,t) \tag{42.5} $$

Here, $\partial_t = -\mu_0\partial_x - \frac{1}{2}\sigma_0^2\partial^2_{xx} + r_0$ from Eqn. (42.4), where $\partial_x = \partial/\partial x$, 
$\partial^2_{xx} = \partial^2/\partial x^2$, $\partial_t = \partial/\partial t$. 

We continue using Fourier Transform (FT) methods. We set 

$$ C(x,t) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{ikx}\, \hat{C}(k,t) \tag{42.6} $$

Note that $\partial_x = ik$, while $\partial^2_{xx} = -k^2$ when operating on $\exp(ikx)$. We get 

$$ C(x, t + \Delta t) = e^{r_0 \Delta t} \int_{-\infty}^{\infty} \frac{dk}{2\pi}
   \exp\left\{ \Delta t\left[ -i\mu_0 k + \tfrac{1}{2}\sigma_0^2 k^2 \right] + ikx \right\} \hat{C}(k,t) \tag{42.7} $$

Note that the integral makes sense only if $\Delta t < 0$, since the integral must 
converge at $k \to \pm\infty$. Hence, we will be propagating information backward from 
some boundary condition given in the future at time $t^*$ (for European options, 
this is the strike date or expiration date). Write the inverse Fourier Transform 
formula at $t = t^*$, 

$$ \hat{C}(k,t^*) = \int_{-\infty}^{\infty} dx^*\, \exp(-ikx^*)\, C(x^*,t^*) \tag{42.8} $$

Set $\Delta t = t_0 - t^*$, $x_0 = \ln S(t_0)$, $x^* = \ln S(t^*)$. This produces 

$$ C(x_0,t_0) = \int_{-\infty}^{\infty} dx^*\, G_f(x_0 - x^*; \Delta t)\, C(x^*,t^*) \tag{42.9} $$


¹⁸ The logarithmic change of variable and the Ito, Stratonovich prescriptions: The 
Black-Scholes discussion could proceed using the stock price $S$ instead of its logarithm $x$. 
If this is done, we need quadratic terms in the expansion of the return of the stock price. 
Setting $S_j = S(t_j)$ and $x_j = x(t_j)$ using a time discretization, we have 

$$ (S_{j+1} - S_j)/S_j = \exp(x_{j+1} - x_j) - 1 \approx (x_{j+1} - x_j) + \tfrac{1}{2}(x_{j+1} - x_j)^2 $$

in order to reproduce the results using $x$. The quadratic term is replaced by its average, 
$\sigma_0^2\,dt/2$, which is valid as $dt \to 0$. Keeping this quadratic term is equivalent to the Ito 
prescription (which we follow), while dropping it is equivalent to the Stratonovich 
prescription. To emphasize it again, we use the Ito prescription. 
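
The role of the quadratic term in the expansion of the stock return can be checked numerically. The sketch below is only an illustration under assumed parameter values (`r0`, `sigma0`, `dt` are not from the text). It computes Gaussian expectations by a simple trapezoid rule and compares the exact mean return $\langle \exp(dx) - 1 \rangle$ with the linear term alone and with the linear plus quadratic terms.

```python
import math


def gaussian_expectation(f, n=2000, z_max=8.0):
    """E[f(Z)] for a standard normal Z, trapezoid rule on [-z_max, z_max]."""
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * f(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


r0, sigma0, dt = 0.05, 0.2, 0.001      # illustrative values (assumptions)
mu0 = r0 - 0.5 * sigma0**2


def dx(z):
    """One log-price step x_{j+1} - x_j driven by a standard normal z."""
    return mu0 * dt + sigma0 * math.sqrt(dt) * z


exact_return = gaussian_expectation(lambda z: math.exp(dx(z)) - 1.0)
linear_term = gaussian_expectation(dx)
quadratic_term = gaussian_expectation(lambda z: 0.5 * dx(z) ** 2)

print("exact mean return      :", exact_return)
print("linear term only       :", linear_term)                   # mu0 * dt
print("linear + quadratic term:", linear_term + quadratic_term)  # ~ r0 * dt
```

With the quadratic term kept, the mean return per step agrees with $r_0\,dt$ up to terms of order $dt^2$; with it dropped, the mean return is only $\mu_0\,dt$.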


The Free Green Function or Propagator 


Here, the "free propagator" Green function $G_f(x_0 - x^*; \Delta t)$ is¹⁹ 

$$ G_f(x_0 - x^*; \Delta t) = \theta(-\Delta t)\, \frac{e^{r_0 \Delta t}}{\sqrt{-2\pi\sigma_0^2 \Delta t}}
   \exp\left[ \frac{(x_0 - x^* - \mu_0 \Delta t)^2}{2\sigma_0^2 \Delta t} \right] \tag{42.10} $$

Here $\theta(-\Delta t)$ is equal to one for $\Delta t < 0$, zero for $\Delta t > 0$, and 1/2 for $\Delta t = 0$. 
The definition $G_f = 0$ for $\Delta t > 0$ has been made for convenience while the 
result for $\Delta t < 0$ follows from standard Gaussian integration in $k$. 
Note that $G_f(x_0 - x^*; \Delta t)$ depends on the co-ordinates only through 
$|x_0 - x^* - \mu_0 \Delta t| / \sqrt{|\Delta t|}$, which is the canonical Brownian motion, 
square-root scaling²⁰. 

The terminal or exercise-date boundary condition for a call option with strike 
price $E$ is 

$$ C(x^*,t^*) = (e^{x^*} - E)_+ \equiv (e^{x^*} - E)\,\theta(e^{x^*} - E) \tag{42.11} $$


¹⁹ Notations for the free propagator: $G_0$, $G_f$ and sometimes just $G$ are interchangeable 
notations in the book. 


²⁰ Non-Brownian Scaling Models: Other possibilities for scaling involving powers other 
than 1/2 are possible in non-Brownian dynamics. However, these models are difficult to 
work with and difficult to understand. The practice on the Street is to use Brownian 
motion with various parameters fit to the market, warts and all. We discuss these 
considerations in Ch. 46 when we discuss the Reggeon Field Theory. 


The Classic Black-Scholes Formula 


The usual Black-Scholes (BS) model result for a call option is then obtained by 
straightforward integration²¹ 

$$ C(x_0,t_0) = S(t_0)\,N(d_+) - E\,e^{-r_0\tau}\,N(d_-) \tag{42.12} $$

Here $\tau = -\Delta t$ is the positive time difference from valuation to expiration²², and 
the “d functions” are²³,²⁴ 

$$ d_- = \frac{\ln[S(t_0)/E] + \mu_0\tau}{\sigma_0\sqrt{\tau}}
       = \frac{\ln[S(t_0)/E] + (r_0 - \sigma_0^2/2)\,\tau}{\sigma_0\sqrt{\tau}} $$

$$ d_+ = d_- + \sigma_0\sqrt{\tau}
       = \frac{\ln[S(t_0)/E] + (r_0 + \sigma_0^2/2)\,\tau}{\sigma_0\sqrt{\tau}} \tag{42.13} $$

The standard normal integral is 

$$ N(\xi) = \int_{-\infty}^{\xi} \frac{du}{\sqrt{2\pi}} \exp(-u^2/2) \tag{42.14} $$


²¹ Normalization: There is often an additional normalization factor to convert the 
equations to real prices. For example, S&P index options have a multiplier of 
$100/contract. 


²² Complexity and times in the real world: To give an idea of real-world complexities, 
in practice there are several different times used. The “diffusion time” $\tau_{df}$ may be used 
for the time from valuation to the date the option decision for exercise is made, the 
“discounting time” $\tau_{dc}$ may be used for the time from valuation to the date that cash is 
actually paid out for the option exercise, etc. Sometimes even fractions of a day are 
included which the options quant will tout as being “more accurate”, although given the 
uncertainties in the volatility this seems like splitting hairs. This sort of detail can be 
particularly annoying if you need to reproduce the results of a black-box model whose 
details are unknown (e.g. the model developers have disappeared). 


²³ Notation: the “d” functions: These functions are ubiquitous in standard options theory 
because they result from the Gaussian integrations over the limits specified by the options 
constraints. Another common notation is $d_1 = d_+$, $d_2 = d_-$. 


²⁴ Probability (Stock price above strike): This probability is equal to $N(d_2) = N(d_-)$. 
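
The "straightforward integration" leading from Eqns. (42.9) to (42.11) to the BS formula can be checked numerically. The following is a minimal sketch, not a production method: it codes the free propagator of Eqn. (42.10), integrates it against the call payoff with a trapezoid rule, and compares with Eqn. (42.12). The function names, the grid size, the integration width, and the parameter values are assumptions made for illustration.

```python
import math


def free_propagator(x0, x_star, dt, r0, sigma0):
    """G_f(x0 - x*; dt) of Eqn. (42.10); nonzero only for dt = t0 - t* < 0."""
    if dt >= 0.0:
        return 0.0                     # theta(-dt); the dt = 0 edge is ignored here
    mu0 = r0 - 0.5 * sigma0**2
    var = -sigma0**2 * dt              # positive variance sigma0^2 * tau
    z = x0 - x_star - mu0 * dt
    return math.exp(r0 * dt - z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def call_by_quadrature(S0, E, r0, sigma0, tau, n=4000, width=8.0):
    """Eqn. (42.9) with the payoff (42.11), trapezoid rule over x*."""
    x0 = math.log(S0)
    center = x0 + (r0 - 0.5 * sigma0**2) * tau
    half = width * sigma0 * math.sqrt(tau)
    h = 2.0 * half / n
    total = 0.0
    for i in range(n + 1):
        x_star = center - half + i * h
        w = 0.5 if i in (0, n) else 1.0
        payoff = max(math.exp(x_star) - E, 0.0)
        total += w * free_propagator(x0, x_star, -tau, r0, sigma0) * payoff
    return total * h


def norm_cdf(xi):
    """Standard normal integral N(xi) of Eqn. (42.14)."""
    return 0.5 * (1.0 + math.erf(xi / math.sqrt(2.0)))


def black_scholes_call(S0, E, r0, sigma0, tau):
    """Eqns. (42.12) and (42.13)."""
    d_minus = (math.log(S0 / E) + (r0 - 0.5 * sigma0**2) * tau) / (sigma0 * math.sqrt(tau))
    d_plus = d_minus + sigma0 * math.sqrt(tau)
    return S0 * norm_cdf(d_plus) - E * math.exp(-r0 * tau) * norm_cdf(d_minus)


quad_price = call_by_quadrature(100.0, 100.0, 0.05, 0.2, 1.0)
bs_price = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(quad_price, bs_price)
```

The two numbers should agree to within the quadrature error, which is the sense in which the BS formula is just the free propagator folded with the terminal payoff.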


Handy Integrals 
Some useful integrals to avoid the calisthenics of completing the square are: 


Š 
max d 2 2 
I, = 1 = exo S| = (ео )уө) (42.15) 


Š 2 2 
— ia dé p = ol 0 _ ay a V max ay 


(42.16) 
Here, V max = Ce -ay[2) ү 2 ` 


Sometimes the parameters are taken differently for physical reasons. A useful 
integral in that case (with all time intervals 7 > 0) is 


" 2 
E E + дуть) 


2 
20.7 
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С(х.ь)=е «^ dx 
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Here 4, =r, -lo;r,, while d = {In[S(,)/E, |+ о, / (ov y "T 


c € 


а =а + (сг, 2 


The reader should carefully note that the main point of emphasis is not this 
standard textbook option result, but rather the more general importance of the 
path integral formalism and the free propagator $G_f$. 


The Semi-Group Property 
The Green function or propagator $G_f$ satisfies the "semi-group" or 
"reproducing-kernel" property, as can easily be seen by direct integration or FT 
techniques: 

$$ G_f(x_0 - x_2; t_0 - t_2) = \int_{-\infty}^{\infty} dx_1\,
   G_f(x_0 - x_1; t_0 - t_1)\, G_f(x_1 - x_2; t_1 - t_2) $$

Note that the same function $G_f$ appears on both sides of this equation. 
Physically this means that free propagation from $(x_2,t_2)$ to $(x_1,t_1)$ followed by 
free propagation from $(x_1,t_1)$ to $(x_0,t_0)$, integrated over all $x_1$, is equivalent to 
free propagation from $(x_2,t_2)$ directly to $(x_0,t_0)$, as illustrated in Fig. 2 at the 
end of the chapter. 
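
The semi-group property is easy to test numerically. The sketch below is an illustration under assumed values for the points and times. It integrates the product of two free propagators over the intermediate co-ordinate with a trapezoid rule and compares the result with a single propagator over the whole interval. The helper names and the labels `x0, x1, x2` are chosen here for the example.

```python
import math


def free_propagator(x_a, x_b, dt, r0, sigma0):
    """G_f(x_a - x_b; dt) with dt = t_a - t_b < 0 (backward propagation)."""
    if dt >= 0.0:
        return 0.0
    mu0 = r0 - 0.5 * sigma0**2
    var = -sigma0**2 * dt
    z = x_a - x_b - mu0 * dt
    return math.exp(r0 * dt - z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def two_step_propagation(x0, x2, t0, t1, t2, r0, sigma0, n=2000, width=10.0):
    """Integral over x1 of G_f(x0 - x1; t0 - t1) * G_f(x1 - x2; t1 - t2)."""
    half = width * sigma0 * math.sqrt(t2 - t0)
    lo, hi = min(x0, x2) - half, max(x0, x2) + half
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x1 = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += (w * free_propagator(x0, x1, t0 - t1, r0, sigma0)
                    * free_propagator(x1, x2, t1 - t2, r0, sigma0))
    return total * h


r0, sigma0 = 0.05, 0.2                       # illustrative parameters
x0, x2 = math.log(100.0), math.log(110.0)    # illustrative end points
t0, t1, t2 = 0.0, 0.3, 1.0                   # t0 < t1 < t2

two_step = two_step_propagation(x0, x2, t0, t1, t2, r0, sigma0)
one_step = free_propagator(x0, x2, t0 - t2, r0, sigma0)
print(two_step, one_step)
```

The choice of the intermediate time `t1` should not matter, which is the content of the reproducing-kernel statement.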


Building up the Path Integral from the Semi-Group Equation 

It is important to recall that the definition of simple European options solved by 
the BS model does involve integration over all x-values for all times between 
the present and the expiration date. In general a European stock option allows 
any intermediate stock price $S(t) = \exp(x(t))$, with $t \in (t_0, t^*)$, from 0 
to $\infty$ in principle, since the investor cannot exercise a European option at times 
before $t^*$ by definition regardless of what the stock price is. Now using the 
semi-group property for $G_f$ we may iterate an arbitrary number of times, 
obtaining²⁵,²⁶ 

$$ G_f(x_0 - x^*; t_0 - t^*) = \left[ \prod_{j=1}^{n-1} \int_{-\infty}^{\infty} dx_j \right]
   \left[ \prod_{j'=0}^{n-1} G_f(x_{j'} - x_{j'+1}; t_{j'} - t_{j'+1}) \right] \tag{42.19} $$

There are $n - 1$ integrals and $n$ propagators. Here, $x^* = x_n$ and $x_j = x(t_j)$ for 
$t_j$ between $t_0$ and $t^* = t_n$, as illustrated in Fig. 1 at the end of this chapter. 


²⁵ What happens if the integrations have constraints? If constraints on the integration 
over intermediate states exist, the discussion becomes more complicated. If the 
constraints are simple enough, closed-form solutions can still be obtained. This is the case 
with simple “barrier” options as described in Ch. 17-19. Under fairly general conditions, 
the iterated path integral satisfies the semi-group formula. With constraints inserted at 
intermediate times, general options can be evaluated using numerical approximations 
(Monte Carlo simulations etc). 


²⁶ Notation for Labels and Indices in Path Integrals - README! For this equation 
only there are big brackets and distinguished indices $j$ and $j'$. In general, by convention 
the brackets and the different labels are to be understood and will not be exhibited. The 
extension of a dummy index labeled $j$ in the first product is intended to extend only 
locally in the formula, up to the second product with a separate dummy index which 
can be labeled by the same letter $j$. This convention is common practice in physics papers 
and avoids cluttering up the page with bracket signs and a plethora of different labels. 
Properly understood, there should be no confusion. 


Formal Continuous Limit for the Path Integral 
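The formal continuous path integral is defined by the $n \to \infty$ limit of the 
successive propagations with fixed $t^* - t_0$. Defining the "velocity" $dx(t)/dt$ as 

$$ \frac{dx(t)}{dt} \equiv \frac{x_{j+1} - x_j}{t_{j+1} - t_j} $$

at $(x_j, t_j)$ as $t_{j+1} - t_j \to 0$, and substituting the 
Gaussian form for local propagation for $G_f$, the path integral for the Green 
function for propagation over the whole interval is written as 

$$ G_f[x_0 - x^*; t_0 - t^*] = e^{-r_0 (t^* - t_0)}
   \int_{x(t_0) = x_0}^{x(t^*) = x^*} \mathcal{D}x(t)\,
   \exp\left\{ -\int_{t_0}^{t^*} \frac{dt}{2\sigma_0^2}
   \left[ \frac{dx(t)}{dt} - \mu_0 \right]^2 \right\} \tag{42.20} $$

Here the measure $\mathcal{D}x(t)$ stands for the product of the $n - 1$ intermediate 
integrations in Eqn. (42.19) together with the Gaussian normalization factors. The 
option price $C(x_0,t_0)$ as a path integral is obtained by inserting the path 
integral expression for $G_f$ into Eqn. (42.9). The above equation for $G_f$ is in the 
standard Lagrangian form of the path integral. 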

So far, it might seem that we have merely succeeded in somehow rendering 
the simple BS model much more complicated. However, the path integral 
formalism that we have presented is fundamental. The simplicity of the BS model 
containing just free propagation allows the path integral to be evaluated in a 
trivial fashion, simply by undoing the steps leading to Eqn. (42.20). 
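
One way to see that the discretized path integral is a practical object is to sample it. In the minimal sketch below, each path is built from $n$ Gaussian steps, one per free propagator in Eqn. (42.19), and the discounted call payoff is averaged over paths. This is plain Monte Carlo under assumed parameters (the number of steps, number of paths, and seed are illustrative), and the result is compared with the BS formula only within statistical error.

```python
import math
import random


def mc_call_by_paths(S0, E, r0, sigma0, tau, n_steps=8, n_paths=20_000, seed=1):
    """Average the discounted call payoff over paths built from n_steps Gaussian steps."""
    rng = random.Random(seed)
    mu0 = r0 - 0.5 * sigma0**2
    dt = tau / n_steps
    step_sd = sigma0 * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = math.log(S0)
        for _ in range(n_steps):
            x += mu0 * dt + step_sd * rng.gauss(0.0, 1.0)   # one propagation step
        total += max(math.exp(x) - E, 0.0)
    return math.exp(-r0 * tau) * total / n_paths


def black_scholes_call(S0, E, r0, sigma0, tau):
    """Closed-form call price, Eqns. (42.12) and (42.13)."""
    def norm_cdf(xi):
        return 0.5 * (1.0 + math.erf(xi / math.sqrt(2.0)))
    d_minus = (math.log(S0 / E) + (r0 - 0.5 * sigma0**2) * tau) / (sigma0 * math.sqrt(tau))
    d_plus = d_minus + sigma0 * math.sqrt(tau)
    return S0 * norm_cdf(d_plus) - E * math.exp(-r0 * tau) * norm_cdf(d_minus)


mc_price = mc_call_by_paths(100.0, 100.0, 0.05, 0.2, 1.0)
bs_price = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(mc_price, bs_price)
```

For the free BS case the intermediate steps are redundant, since by the semi-group property one step would do. They are kept here because they are what survives when the drift or volatility depend on $x$ and $t$.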


More General Parameters and the Path Integral 

Now consider replacing the constant drift $\mu_0$ by a general price and time 
dependent drift function $\mu(x,t)$. This is, for example, produced by the general 
dividend model, which can produce jumps, and requires the path integral 
apparatus or an equivalent approach. Similarly, allowing the volatility $\sigma_0$ to 
become a function $\sigma(x,t)$ in order to include “skew” effects likewise requires 
the path integral. The American option restricts the class of paths in a non-trivial 
way involving some complicated optimization logic, and again necessitates the 
path integral. 

In each of these cases, every propagation is forced to occur in infinitesimal 
$\Delta t$. At the end of the section, there is a diagram to illustrate this point. 

Convolution of these propagations to get a finite-time propagator in the 
general case cannot be evaluated analytically. The semigroup property is true, but 
now holds only for the path integral itself if finite time intervals $t_0 - t_1$, $t_1 - t_2$ 
are considered. Since, for the BS case, the path integral itself is just the free 
propagator with modified parameters, the result is trivial to obtain. 
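
To make the remark about restricted paths concrete, here is a rough sketch of backward propagation on a grid in $x$ with an exercise check after every small step, for a put. It is an illustration only: it is not the discretization method of Ch. 44, it treats the option as Bermudan with `n_steps` exercise dates, and the grid size, width, and parameters are assumptions. With `early_exercise=False` the same loop just composes free propagators and should reproduce the European value.

```python
import math


def put_by_backward_propagation(S0, E, r0, sigma0, tau, early_exercise,
                                n_steps=50, n_grid=401, width=6.0):
    """Propagate the put payoff backward from t* to t0 in n_steps small steps."""
    mu0 = r0 - 0.5 * sigma0**2
    dt = tau / n_steps
    half = width * sigma0 * math.sqrt(tau)
    h = 2.0 * half / (n_grid - 1)
    x0 = math.log(S0)
    xs = [x0 - half + i * h for i in range(n_grid)]
    intrinsic = [max(E - math.exp(x), 0.0) for x in xs]

    # One-step weights G_f(x_i - x_{i+k}; -dt) * h depend only on the offset k.
    k_max = int(math.ceil((6.0 * sigma0 * math.sqrt(dt) + abs(mu0) * dt) / h))
    norm = math.exp(-r0 * dt) * h / math.sqrt(2.0 * math.pi * sigma0**2 * dt)
    weights = {}
    for k in range(-k_max, k_max + 1):
        z = -k * h + mu0 * dt                      # x_i - x_{i+k} + mu0 * dt
        weights[k] = norm * math.exp(-z * z / (2.0 * sigma0**2 * dt))

    values = intrinsic[:]                          # boundary condition at t*
    for _ in range(n_steps):
        new_values = []
        for i in range(n_grid):
            acc = 0.0
            for k in range(max(-k_max, -i), min(k_max, n_grid - 1 - i) + 1):
                acc += weights[k] * values[i + k]  # one small propagation
            if early_exercise:
                acc = max(acc, intrinsic[i])       # restriction on the paths
            new_values.append(acc)
        values = new_values
    return values[(n_grid - 1) // 2]               # the grid midpoint is x0


european = put_by_backward_propagation(100.0, 100.0, 0.05, 0.2, 1.0, False)
bermudan = put_by_backward_propagation(100.0, 100.0, 0.05, 0.2, 1.0, True)
print(european, bermudan)
```

The exercise check breaks the Gaussian convolution structure, so the finite-time result can no longer be written as a single free propagator. That is why every step has to be taken explicitly.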


The Path Integral Satisfies the Diffusion Equation 
It is profitable to see how the path integral solution for $C(x,t)$ satisfies the 
diffusion differential equation (42.4). We first start with the free diffusion 
equation with constant parameters. By direct algebra it is not hard to see that the 
Green function $G_f$ satisfies the following equation²⁷ 

$$ \left[ \partial_t + \mu_0\partial_x + \tfrac{1}{2}\sigma_0^2\partial_x^2 - r_0 \right]
   G_f(x - x'; t - t') = -\delta(x - x')\,\delta(t - t') \tag{42.21} $$


Dirac Delta Functions 

Here $\delta(\xi)$ is the Dirac delta-function, mathematically a Schwartz 
distribution, which is defined by the formula²⁸ 

$$ \int_{-A}^{A} f(\xi)\,\delta(\xi)\,d\xi = f(0) \tag{42.22} $$

for any suitable "test" function $f(\xi)$ and any $A > 0$. We will also need the 
formula²⁹ 

$$ \delta(t - t') = -\partial_t\,\theta(t' - t) \tag{42.23} $$


27 Homework: Show this. Try at first not to look at the comments below. 


?* Dirac Delta Function and Schwartz Distributions: This is put on a rigorous basis 
using Schwartz distribution theory. The interested reader who is unfamiliar with the 
theory of distributions is invited to consult the references. Only simple manipulations will 
be required here, and will be explained when needed. 


29 Homework: Show this. It will give you some insight into the Dirac delta function.
Details: How the Green Function Satisfies the Diffusion Equation
The fact that $G_0$ satisfies the singular diffusion differential equation can be seen
by expanding $G_0(x-x';\,t+dt-t')$ in a series in $dt$ to perform the time partial
derivative. Again, our notation is $dt = -\Delta t > 0$. We need to take the limit
$t \to t'$ to see the singular behavior. We need the replacement of the square
displacement by its average, $(x'-x)^2 \to \sigma_0^2\,dt = -\sigma_0^2\,\Delta t$, valid in the
infinitesimal limit³⁰. We ignore higher order terms $(x-x')\,G_0$ or $(t-t')\,G_0$ relative to the
leading $-\delta(x-x')\,\delta(t-t')$ term. Note the $\delta(x-x')$ behavior of a Gaussian
as $(t-t') = \Delta t \to 0$. As we take the limit $t-t' \to 0$ through negative values, the
Green function becomes the delta function, $G_0 \to \delta(x-x')$. This produces the
required boundary condition at the expiration date with $t' = t^*$, $x' = x^*$.
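
The homework above can also be checked numerically. The following is a minimal sketch, not the book's derivation: it assumes the free Green function has the usual discounted Gaussian form with $\mu_0 = r_0 - \sigma_0^2/2$ and with $t < t'$, and it uses central finite differences (the function names and the step `h` are illustrative choices) to confirm that the operator of Eqn. (42.21) annihilates $G_0$ away from the singular point.

```python
import math

def green_free(x, t, x_prime, t_prime, r0, sigma0):
    """Assumed form of the free Green function G_0 for t < t_prime."""
    tau = t_prime - t                      # positive time to the later point
    mu0 = r0 - 0.5 * sigma0 ** 2           # no-arbitrage drift of x = ln S
    var = sigma0 ** 2 * tau
    gauss = math.exp(-(x_prime - x - mu0 * tau) ** 2 / (2.0 * var))
    return math.exp(-r0 * tau) * gauss / math.sqrt(2.0 * math.pi * var)

def pde_residual(x, t, x_prime, t_prime, r0, sigma0, h=1e-4):
    """Left-hand side of Eqn. (42.21) by central differences in (x, t)."""
    mu0 = r0 - 0.5 * sigma0 ** 2
    g = lambda xx, tt: green_free(xx, tt, x_prime, t_prime, r0, sigma0)
    d_t = (g(x, t + h) - g(x, t - h)) / (2.0 * h)
    d_x = (g(x + h, t) - g(x - h, t)) / (2.0 * h)
    d_xx = (g(x + h, t) - 2.0 * g(x, t) + g(x - h, t)) / h ** 2
    return d_t + mu0 * d_x + 0.5 * sigma0 ** 2 * d_xx - r0 * g(x, t)

if __name__ == "__main__":
    # Away from t = t', the residual should vanish up to discretization error.
    print(pde_residual(0.0, 0.0, 0.1, 1.0, 0.05, 0.2))
```

The residual is zero only for $t \neq t'$; the delta functions on the right-hand side of Eqn. (42.21) live entirely at the singular point, which a finite-difference check cannot see.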


The No-Arbitrage World and the Fictitious Stock Prices
Next, we give a formal but simple argument leading to the path integral. First, the
diffusion equation (42.4) implies that $\exp(-r_0 t)\,C(x,t)$ is related³¹ to the
probability $\mathcal{P}(\tilde S, t)$ for a “fictitious” stock price $\tilde S$ to have the value $\tilde S(t)$ at time
$t$ in a “no-arbitrage world”. The stochastic equation for $\tilde S$ that yields the
diffusion equation for $\mathcal{P}(\tilde S, t)$, namely Eqn. (42.4) with the term $r_0 C$ removed
and $\mathcal{P}$ substituted for $C$, is just

$$\frac{d_t\tilde x(t)}{dt} = \mu_0 + \sigma_0\,\eta(t) \tag{42.24}$$

Here, $\tilde x = \ln(\tilde S)$ and $\eta = dz/dt$ is the formal derivative of $z(t)$. This is the
stochastic equation for $x = \ln(S)$ but with the replacement of the stock return
$\mu_S$ by the risk free rate $r_0$, as we found from no arbitrage. The return $\mu_S$ is an
irrelevant variable and does not enter in the determination of the option $C$.


30 Limiting Process: As indicated below, we really need a two-step limiting process.
First we set $dt = -\Delta t = L\,\delta t$, and let $\delta t \to 0$ with $L \to \infty$, such that $\Delta t$ is constant. This
allows the replacement of $(x'-x)^2$ by $\sigma_0^2\,dt = -\sigma_0^2\,\Delta t$. Then we let $\Delta t \to 0$.


31 Option, Probability Relation: The relation involves the second derivative of the
option with respect to its strike.
Notation: $dx(t)$, $d_t x(t)$, $\eta(t)$ and Brownian Motion

It is useful to focus on $\eta(t) = dz(t)/dt$, a random Gaussian slope variable. It is
also necessary to avoid confusion regarding differentials. The symbol “$dx(t)$”
can appear with two different meanings, which we need to distinguish.

For fixed time $t$, we can integrate over all possible values of the variable $x$
at time $t$. The measure in this integral contains the ordinary integration measure
$dx$ specified at time $t$, or $dx(t)$. On the other hand, the stochastic Langevin-Ito
equation tells us how $x$ at time $t$ differs from $x$ at time $t + dt$ for fixed but
infinitesimal $dt$, i.e. $x(t+dt) - x(t)$, which is the time difference, unfortunately
commonly also called $dx(t)$. So instead, here we use another symbol, $d_t x(t)$.

We can draw a straight line between $x(t)$ and $x(t+dt)$, for a given path
realization. The slope of this path segment line, which we call $\eta(t)$, is a Gaussian
random variable. This is because a random walk occurs even for the infinitesimal
interval between $t$ and $t+dt$. Brownian motion assumes that a random walk
occurs inside any time scale, no matter how small. See the figure below.


Brownian Motion over Infinitesimal dt

[Figure: a path segment from $x(t)$ to $x(t+dt)$. The slope of the line is
$\eta(t) = d_t x(t)/dt$, with $d_t x(t) = x(t+dt) - x(t)$.]

Inside are still an infinite number of steps. The slope $\eta(t) = d_t x(t)/dt$ is
a Gaussian random variable if the time difference $d_t x(t)$ is Gaussian.
Unfortunately, $d_t x(t)$ is often called $dx(t)$. This causes confusion when
doing integrals over the $x(t)$ variable, since $dx(t)$ is just the ordinary
measure on the $x(t)$ axis. Discretizing avoids confusion.
Now integration over $x(t)$ for all times $t$ between two times $t_1, t_2$ is
equivalent to integration over all paths between $x(t_1)$ and $x(t_2)$. One can
picture this in two ways. First, one can integrate over all values of each $x(t)$ at
each value of $t$. Alternatively, one can integrate over all slopes of all
intermediate path segments.

Any confusion as to the physical interpretation of what is going on can be 
resolved by taking finite discrete time partitions. It is sometimes helpful to think 
of x as the co-ordinate for a particle undergoing a random walk; the straight-line 
path segments can be viewed as free flight between successive scatterings with 
incremental scattering angles given by a Gaussian probability distribution (see 
Feynman, Williamson, Refs). 
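
A short simulation may make the discretized picture concrete. This is only a sketch under stated assumptions: the increment over $dt$ is taken as $d_t x = (\mu_0 + \sigma_0\,\eta)\,dt$ with Gaussian noise, the interval may be split into `n_sub` finer steps (the “random walk inside $dt$”), and all function and parameter names are illustrative. It checks that the noise part of the slope has variance $1/dt$ whether or not the interval is subdivided.

```python
import math
import random

def slope_samples(dt, n_samples, mu0, sigma0, n_sub=1, seed=0):
    """Sample eta = (d_t x / dt - mu0) / sigma0 for segments of length dt.

    Each segment is built from n_sub Gaussian sub-steps, so the random walk
    continues inside the interval dt.
    """
    rng = random.Random(seed)
    sub_dt = dt / n_sub
    etas = []
    for _ in range(n_samples):
        d_t_x = sum(mu0 * sub_dt + sigma0 * math.sqrt(sub_dt) * rng.gauss(0.0, 1.0)
                    for _ in range(n_sub))
        etas.append((d_t_x / dt - mu0) / sigma0)
    return etas

def scaled_variance(etas, dt):
    """Sample variance of eta multiplied by dt; should be close to one."""
    mean = sum(etas) / len(etas)
    var = sum((e - mean) ** 2 for e in etas) / len(etas)
    return var * dt

if __name__ == "__main__":
    dt = 1e-4
    for n_sub in (1, 10):
        etas = slope_samples(dt, 20000, 0.03, 0.2, n_sub=n_sub)
        print(n_sub, scaled_variance(etas, dt))
```

The variance $1/dt$ is the discretized counterpart of a delta-function correlation for $\eta$: the slope becomes arbitrarily steep as $dt \to 0$, which is why it is safer to reason with finite time partitions.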

The figure below generated by computer may also help with the intuition: 


[Figure: Brownian Motion inside interval dt = 10^(-4); one simulated path (“path 1”).]
Connection of Path Integral with the Stochastic Equations


The path integral is not only fully consistent with the stochastic equations, the
stochastic equations are directly used to get the path integral. Now in fact it is
$\eta(t)$ that is the fundamental starting point. We can start with the statement that
$\eta(t)$ is a Gaussian random variable and then use the stochastic equation to
introduce $d_t\tilde x(t)$.
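Consider the total conditional probability, which we call $\mathcal{P}(\tilde x_0,\tilde x^*;t_0,t^*)$.
This $\mathcal{P}(\tilde x_0,\tilde x^*;t_0,t^*)$ by definition is the product of all the $\eta(t)$ Gaussians,
integrated over each $\eta(t)$ with $t \in (t_0,t^*)$ and over each $\tilde x(t)$ variable³² such
that $\tilde x(t_0) = x_0$ and $\tilde x(t^*) = x^*$. In the discretization $\eta_j = \eta(t_j)$ with
$j = 0,\dots,n-1$ and with $t^* = t_n$, the final $\eta_{n-1}$ is determined by the requirement
$\tilde x(t^*) = x^*$, viz $\eta_{n-1} = \frac{1}{\sigma_0}\left[\frac{x^* - \tilde x_{n-1}}{dt} - \mu_0\right]$.

The next formula for the Dirac delta function is the key to what we need:

$$1 = \int_{-\infty}^{\infty} d\tilde x_{j+1}\;\delta\!\left(\tilde x_{j+1} - \tilde x_j - \mu_0\,dt - \sigma_0\,\eta_j\,dt\right)
   = \int_{-\infty}^{\infty} d\tilde x_{j+1}\;\frac{1}{\sigma_0\,dt}\;\delta\!\left(\eta_j - \frac{1}{\sigma_0}\left[\frac{\tilde x_{j+1}-\tilde x_j}{dt} - \mu_0\right]\right) \tag{42.25}$$

We get

$$\mathcal{P}(\tilde x_0,\tilde x^*;t_0,t^*) = \prod_{j=0}^{n-2}\int_{-\infty}^{\infty} d\eta_j \prod_{j=0}^{n-1}\sqrt{\frac{dt}{2\pi}}\,\exp\!\left(-\tfrac{1}{2}\eta_j^2\,dt\right) \tag{42.26}$$

$$= \prod_{j=0}^{n-2}\int_{-\infty}^{\infty} d\eta_j \int_{-\infty}^{\infty} d\tilde x_{j+1}\;\frac{1}{\sigma_0\,dt}\;\delta\!\left(\eta_j - \frac{1}{\sigma_0}\left[\frac{\tilde x_{j+1}-\tilde x_j}{dt} - \mu_0\right]\right)
   \times \prod_{j=0}^{n-1}\sqrt{\frac{dt}{2\pi}}\,\exp\!\left(-\tfrac{1}{2}\eta_j^2\,dt\right) \tag{42.27}$$

$$= \prod_{j=0}^{n-2}\int_{-\infty}^{\infty} d\tilde x_{j+1} \prod_{j=0}^{n-1}\left(2\pi\sigma_0^2\,dt\right)^{-1/2}\exp\!\left\{-\frac{dt}{2\sigma_0^2}\left[\frac{\tilde x_{j+1}-\tilde x_j}{dt} - \mu_0\right]^2\right\} \tag{42.28}$$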

32 Notation: In this section only, we keep the tilde ~ notation to remind the reader that we
work in the fictitious no-arbitrage world with the stock return replaced by the risk free 
rate. 




Here the product is over all $t_j \in (t_0,t^*)$ in the discretized version as the time
step vanishes, $dt \to 0$. The Dirac $\delta$ functions eliminate the $\eta(t)$ integrals and
replace $\eta(t)$ by $\frac{1}{\sigma_0}\left[\frac{d_t\tilde x(t)}{dt} - \mu_0\right]$ in the Gaussians. To avoid confusion over
labels it is best to use the discretized version³³.

This result for $\mathcal{P}(\tilde x_0,\tilde x^*;t_0,t^*)$ is exactly the same as for the path integral
for the Green function $G_0(x_0 - x^*;\,t_0 - t^*)$ up to the discount factor $e^{-r_0(t^*-t_0)}$, since
the $\{\tilde x(t)\}$ are dummy variables.


No Arbitrage: An Equivalent Approach 


The entire no-arbitrage discussion for the path integral is equivalent to specifying 
the drift parameter in the Green function. We could have started with an arbitrary 


drift and then insisted that the portfolio $V$ (consisting of $N_S$ shares of stock and
$N_C$ options) have a return equal to the risk-free rate. This constraint would then


specify the drift. Viewed in this fashion, no arbitrage just consists of specifying 
parameters. This direct simplicity is characteristic of the path-integral approach. 
First the dynamics are specified and the Green function derived. Then appropriate 
parameters are specified by physical market constraints. These constraints 
constitute the no-arbitrage conditions. For further discussion, see App. B. 

We will encounter exactly the same idea when we apply path integrals to 
interest rate options. 

This concludes the discussion of the basic path integral framework and the 
Black-Scholes model. 


Dividends and Jumps with Path Integrals 


The incorporation of the dividends provides the first example where the path 
integral—or some approximation to it—is in general needed, even with European 
options. Dividends in the general case produce stock-dependent effects that 
cannot be treated analytically, except in a simple “dividend-yield” approximation 
or in deterministic cash dividends. Cash payments at specified dates produce 
jumps in the stock price across these dates, because the stock price includes a 
potential dividend before it is paid and the stock is less valuable after a dividend 
has been paid. 


33 Derivation hint: The last time step is important. The integral of the probability over x* 
is one. The first product in eqn. (42.27) then extends to j = n-1. This explains the apparent 
mismatch of normalization factors with eqn. (42.28). 
It is worth stating that jumps in the stock price over short times can happen
for a variety of causes besides dividends—bad news, announcement or 
cancellation of a takeover, etc. The formalism presented in this section is 
applicable to these effects also in some approximation. 


Let $D(x,t)$ be the dollar dividend per share of stock per unit time. The
normalization is made for convenience. $D(x,t)$ will in general not be a
continuous function, since dividends are paid at discrete times. The stock price
changes by an extra amount $-D(x,t)$ per unit time. Hence, we place this extra
term divided by $S(t)$ in the drift function for the return $d_tS/S$ of the stock
price, where $d_tS(t) = S(t+dt) - S(t)$. Define the “effective drift function”
$\mu(x,t)$ by

$$\mu(x,t) = r_0 - \tfrac{1}{2}\sigma_0^2 - D(x,t)\,e^{-x} = \mu_0 - D(x,t)\,e^{-x} \tag{42.29}$$

Again $x(t) = \ln S(t)$. We first assume $D(x,t)/S(t) = D(x,t)\,e^{-x(t)} = D_y$ is a
constant. $D_y$ is called the dividend yield³⁴. This common approximation makes
the problem analytically soluble. For this case, the drift is constant, viz
$\mu(x,t) = \mu_D = \mu_0 - D_y = r_0 - D_y - \tfrac{1}{2}\sigma_0^2$, and so we immediately get the option
in the same way as in the last section³⁵. Noting the identity


34 Notation: The dividend yield called $D_y$ here is also called y and sometimes q.


35 Dividend Yield vs. Cash Dividends, Dividend Models, Dividend Risk, and a Story: 
The dividend yield might seem less accurate than the cash dividend. The cynic would say 
that future dividends are just assumptions anyway (do you know what cash dividends 
IBM will pay in the future?). There are various models of dividends (historical, growing 
at a constant rate, growing with the forward stock price, etc.). Various assumptions can 
lead to significantly different option prices. Related disasters occasionally hit the news. 
See e.g. the article describing a big loss at a broker-dealer related to dividend models: 
“Blind faith", The Economist, 1/31/98, page 76. 


Complications do occur if trading occurs around the date of a dividend payment. I 
once heard an emerging markets trader recount with joy a trade with a competitor who 
was screwed (a technical term used by the trader), because the competitor used dividend 
yield and not cash dividends. For index options, there is less of an issue since dividends 
are paid by the various stocks in the index at different times. The risk, e.g. the change of 
option value with stock price S, depends on cash vs. yield dividends - since in one case D 
is constant and in the other D/S is constant. 

Here is the story. I was once called in to settle a problem. I walked into the 
conference room chock full of people, including the department head, who were baffled 
in various degrees by a risk report produced by their risk system. The explanation was the 
point above regarding whether D or D/S is held constant as S is moved. In retrospect, the 
incident now seems somewhat amusing. At the time, it all had the air of a serious 
courtroom drama, with arrows being slung in various directions. 
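$$\exp\!\left\{-\frac{\left[x_0 - x^* - \mu_D\,\Delta t\right]^2}{-2\sigma_0^2\,\Delta t}\right\} e^{x^*}
 = \exp\!\left[x_0 - (r_0 - D_y)\,\Delta t\right]\,\exp\!\left\{-\frac{\left[x_0 - x^* - \bar\mu_D\,\Delta t\right]^2}{-2\sigma_0^2\,\Delta t}\right\} \tag{42.30}$$

Here, $\bar\mu_D = \mu_D + \sigma_0^2$ and $\Delta t = t_0 - t^* < 0$. This produces the European call option³⁶ result with $d_\pm$ as
before but with $\mu_0$ replaced by $\mu_D$,

$$C(x_0,t_0) = S(t_0)\,e^{D_y\,\Delta t}\,N(d_+) - E\,e^{r_0\,\Delta t}\,N(d_-) \tag{42.31}$$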


Note that if the dividend yield is large enough, the option value $C(x_0,t_0)$ is
less than the intrinsic value $C_{\rm intrinsic}(x_0,t_0) = S(t_0) - E$. Hence, the option
holder would, if he could, exercise the call and pick up a profit $C_{\rm intrinsic} - C$ from
getting the intrinsic value while losing the option. For a European option, such
early exercise is by definition not possible, but for American or Bermuda options
that allow early exercise, this plays an important role. Basically the idea is that it
may be better to buy the stock now at cost $E$ and sell it, rather than wait to buy
the stock at cost $E$ after large dividends have reduced its value, overcoming the
possible higher stock prices from random diffusion. We will study American
options in the next section.

The forward stock price, which is the average no-arbitrage stock price at
some future time $t$, is always used in discussions of equity options. This is

$$S_{\rm fwd}(t) = \langle S(t)\rangle = \int_{-\infty}^{\infty} dx\;\mathcal{P}(x_0,x;t_0,t)\;S(t) \tag{42.32}$$

with $S(t) = e^{x}$ and $\mathcal{P}$ the no-arbitrage probability density for $x$ at time $t$.
The result follows immediately from the identity Eqn. (42.30), using the fact
that the integral over a Gaussian over infinite limits is one. We get

$$S_{\rm fwd}(t) = S_0\,\exp\!\left[(r_0 - D_y)(t - t_0)\right] \tag{42.33}$$
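
To see the size of these effects, here is a minimal sketch of the dividend-yield call formula (42.31). It assumes the standard definitions $d_\pm = [\ln(S/E) + (r_0 - D_y \pm \sigma_0^2/2)\tau]/(\sigma_0\sqrt{\tau})$ with $\tau = t^* - t_0$, which the text only refers to as "as before"; the function names are illustrative.

```python
import math

def norm_cdf(z):
    """Cumulative standard normal N(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_with_dividend_yield(s0, strike, r0, div_yield, sigma0, tau):
    """European call with constant dividend yield; tau is time to expiration."""
    vol = sigma0 * math.sqrt(tau)
    d_plus = (math.log(s0 / strike) + (r0 - div_yield + 0.5 * sigma0 ** 2) * tau) / vol
    d_minus = d_plus - vol
    return (s0 * math.exp(-div_yield * tau) * norm_cdf(d_plus)
            - strike * math.exp(-r0 * tau) * norm_cdf(d_minus))

def forward_price(s0, r0, div_yield, tau):
    """Forward stock price with constant dividend yield."""
    return s0 * math.exp((r0 - div_yield) * tau)

if __name__ == "__main__":
    # A deliberately large dividend yield: the European call is worth less
    # than the intrinsic value S - E of an in-the-money call.
    s0, strike = 150.0, 100.0
    print(call_with_dividend_yield(s0, strike, 0.05, 0.20, 0.2, 1.0), s0 - strike)
    print(forward_price(s0, 0.05, 0.20, 1.0))
```

With a 20% yield the forward price sits below spot, which is the mechanism behind the early-exercise incentive described above.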


It is instructive to get this result in another way. We have for any Gaussian theory 


$$\langle \exp(x)\rangle = \exp\!\left(\langle x\rangle + \tfrac{1}{2}\langle x^2\rangle_c\right) \tag{42.34}$$


36 Options on futures: This case is reproduced by taking $D_y = r_0$ so that the future $F$ has
zero average return $\langle d_tF/F\rangle = 0$. Of course, futures do not have dividends but neither do
futures have a deterministically appreciating component since they are not assets, so this
formal trick gives the correct result.
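Here, the second-order “connected part” with subscript $(\ )_c$ is the usual quantity

$$\langle x^2\rangle_c = \left\langle (x - \langle x\rangle)^2\right\rangle = \langle x^2\rangle - \langle x\rangle^2 \tag{42.35}$$

Integrating the stochastic equation for $x = \ln(S)$ produces

$$x(t) = x_0 + \mu_D\,(t - t_0) + \sigma_0\int_{t_0}^{t}\eta(t')\,dt' \tag{42.36}$$

Here, $\eta(t) = dz(t)/dt$ as explained above, with the expectation of the product
$\eta(t')\,\eta(t'')$ being a Dirac delta function³⁷

$$\langle \eta(t')\,\eta(t'')\rangle = \delta(t'-t'') \tag{42.37}$$

We get $\langle x\rangle = x_0 + \mu_D\,(t-t_0)$ and
$\langle x^2\rangle_c = \sigma_0^2\int_{t_0}^{t}dt'\int_{t_0}^{t}dt''\,\langle\eta(t')\,\eta(t'')\rangle = \sigma_0^2\,(t-t_0)$.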


Hence the forward stock price (in the no-arbitrage world) with constant dividend 
yield is, as above, 


$$S_{\rm fwd}(t) = S_0\,\exp\!\left[(r_0 - D_y)(t - t_0)\right] \tag{42.38}$$


Note that the forward price is independent of the volatility. The option price 
can be rewritten in terms of the forward stock price if desired. This produces 
useful insight. In the absence of volatility, the single deterministic path along the 
forward stock price path gives the dynamics. The volatility can then be viewed as 
producing fluctuations about the forward stock price. 
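
The Gaussian identity (42.34) and the resulting forward price can be verified by direct quadrature. The sketch below assumes $x(t)$ is Gaussian with mean $x_0 + \mu_D\tau$ and variance $\sigma_0^2\tau$, where $\mu_D = r_0 - D_y - \sigma_0^2/2$ and $\tau = t - t_0$; the grid size and the function names are illustrative choices.

```python
import math

def forward_by_quadrature(s0, r0, div_yield, sigma0, tau, n=4001, n_std=10.0):
    """Average of exp(x) over the Gaussian density of x, by a Riemann sum."""
    mu_d = r0 - div_yield - 0.5 * sigma0 ** 2
    mean = math.log(s0) + mu_d * tau
    std = sigma0 * math.sqrt(tau)
    dx = 2.0 * n_std * std / (n - 1)
    total = 0.0
    for i in range(n):
        x = mean - n_std * std + i * dx
        density = math.exp(-(x - mean) ** 2 / (2.0 * std ** 2)) / (std * math.sqrt(2.0 * math.pi))
        total += density * math.exp(x) * dx
    return total

def forward_closed_form(s0, r0, div_yield, tau):
    """exp(<x> + <x^2>_c / 2): the volatility cancels out of the exponent."""
    return s0 * math.exp((r0 - div_yield) * tau)

if __name__ == "__main__":
    for sigma0 in (0.1, 0.2, 0.4):
        print(sigma0,
              forward_by_quadrature(100.0, 0.05, 0.02, sigma0, 1.0),
              forward_closed_form(100.0, 0.05, 0.02, 1.0))
```

Running the loop with several volatilities shows the same forward price each time, since the $-\sigma_0^2/2$ in the drift is exactly offset by the $+\sigma_0^2/2$ from the connected part.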

For deterministic and discrete cash dividends (i.e. specified dividends at
specified times in the future), we can proceed as follows. If a first cash dividend
$D_1$ is paid at time $t_1$, then the stock price will drop by $D_1$ at $t_1$. The forward stock
price just after the dividend is paid is $S_{\rm fwd}(t_1) = S_0\exp[r_0(t_1 - t_0)] - D_1$. Then
$S_{\rm fwd}(t_1)$ is moved forward to the next dividend payment at $t_2$ by a factor


37 Delta-function expectation of $\eta(t')\,\eta(t'')$: This formula is equivalent to $\langle (dz)^2\rangle = dt$ as
can be seen by straightforward integration.
$\exp[r_0(t_2 - t_1)]$. Continuing this logic, if a number of cash dividends $\{D_\beta\}$ are
paid at times $\{t_\beta\}$ before time $t$ we get the forward stock price at time $t$ as

$$S_{\rm fwd}(t) = S_0\exp[r_0(t - t_0)] - \sum_{t_0 < t_\beta < t} D_\beta\,\exp[r_0(t - t_\beta)] \tag{42.39}$$

Taking the present value of this equation back from time $t$, we get the value
of the stock with the effect of dividends subtracted out; call it $S_{\rm NoDividends}$. This is
the spot stock price $S_0$ minus the sum of the present value of the dividends.
Dividends are discounted back from payment dates to $t_0$.

$$S_{\rm NoDividends} = S_0 - \sum_{t_0 < t_\beta < t} D_\beta\,\exp[-r_0(t_\beta - t_0)] \tag{42.40}$$


Consider the figure below. 


[Figure: stock price path versus time $t$, showing jumps in stock price by amounts $-D_\beta$ at
times $t_\beta$. These can be dividends, or other.]


It should be clear from the preceding discussion that in general, jumps in the 
stock price can be handled in the same way as for cash dividends. 
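
The bookkeeping for discrete cash dividends in Eqns. (42.39) and (42.40) is easy to get wrong in practice, so a small sketch may help. It assumes a flat rate $r_0$ and a list of `(t_beta, d_beta)` pairs; the function names and the sample dividend schedule are illustrative, not market data.

```python
import math

def forward_with_cash_dividends(s0, r0, t0, t, dividends):
    """Eqn. (42.39): grow the spot at r0, subtract each dividend grown from its pay date."""
    fwd = s0 * math.exp(r0 * (t - t0))
    for t_beta, d_beta in dividends:
        if t0 < t_beta < t:
            fwd -= d_beta * math.exp(r0 * (t - t_beta))
    return fwd

def spot_without_dividends(s0, r0, t0, t, dividends):
    """Eqn. (42.40): spot minus the present value of dividends paid before t."""
    pv = sum(d_beta * math.exp(-r0 * (t_beta - t0))
             for t_beta, d_beta in dividends if t0 < t_beta < t)
    return s0 - pv

if __name__ == "__main__":
    divs = [(0.25, 1.0), (0.75, 1.0)]      # hypothetical schedule: (time, cash amount)
    fwd = forward_with_cash_dividends(100.0, 0.05, 0.0, 1.0, divs)
    print(fwd, spot_without_dividends(100.0, 0.05, 0.0, 1.0, divs))
```

Discounting the forward back to $t_0$ reproduces $S_{\rm NoDividends}$, which is the consistency check between the two equations.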
In order to get the option with deterministic discrete cash dividends we can
simply replace the forward stock price using the dividend yield by this forward
stock price using cash dividends³⁸.

If the payment times are uncertain but the distribution $\Phi_{\rm time}(\{t_\beta\})$ of
payment times is known, we would need to perform the appropriate time
averaging. Similarly, if the dividend amounts are only known probabilistically
with distribution $\Phi_D(\{D_\beta\})$ we need to perform dividend averaging.


For general dividend functions, the Green function is now dependent on all
$\{\mu(x_j,t_j)\}$. We call the Green function $G_D(x,x';t,t')$. The Green function is no
longer a simple function of co-ordinate and time differences. In fact it depends on
all intermediate co-ordinates and times³⁹. Now, for an infinitesimal time step the
propagation is free, i.e. $G_D(x,x';t,t') = G_0(x-x';\,t-t')\big|_{\mu(x,t)}$ for small
enough $t-t'$. Physically, “small enough” means small compared with the time scale over which the
dividend function varies appreciably. For dividends paid at discrete times,
infinitesimal time steps are not needed since the semigroup property can be used
to combine the free propagation between the discrete dividend payments. So, if
and only if $\Delta t = t - t' \to 0$, $G_D$ is given by the free form $G_D^{(0)}$ where

$$G_D^{(0)}(x,x';t,t') = \frac{e^{r_0\,\Delta t}}{\sqrt{-2\pi\sigma_0^2\,\Delta t}}\;\exp\!\left\{-\frac{\left[x - x' - \mu(x,t)\,\Delta t\right]^2}{-2\sigma_0^2\,\Delta t}\right\} \tag{42.41}$$


The formal expression for the path integral is defined as the limit of the
successive propagations with dividends paid over infinitesimal time steps, exactly
as in the previous discussion without dividends. We get for finite number of steps
$n$ the result

$$G_D(x_0,x^*;t_0,t^*) = \prod_{j=1}^{n-1}\int_{-\infty}^{\infty} dx_j \prod_{j=0}^{n-1} G_D^{(0)}(x_j,x_{j+1};t_j,t_{j+1}) \tag{42.42}$$

with $x_n = x^*$ and $t_n = t^*$.


38 Another model for options on stocks with Dividends: Sometimes the assumption is
made for Brownian motion of the logarithm of $S_{\rm NoDividends}$. This is reasonable because,
after all, the dividends are not stochastic. However this leads to different results for a
given volatility than we have presented because the volatility part of $d_tS/S$ is not the
same. In order to get the same result the volatility would have to be modified by the ratio
$S_{\rm NoDividends}/S$. Just another ambiguity.


39 Notation: We have left off the tildes on the x variables since by now the reader should
be used to the idea that we are using the no-arbitrage dictum. 
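The result has the same form as for the free diffusion case but now with the
general drift. In the limit $n \to \infty$ we get the formal functional result

$$G_D(x_0,x^*;t_0,t^*) = e^{-r_0(t^*-t_0)} \int_{\substack{\text{Paths with}\\ x(t_0)=x_0,\; x(t^*)=x^*}} \left[\mathcal{D}x(t)\right]\,
\exp\!\left\{-\frac{1}{2\sigma_0^2}\int_{t_0}^{t^*}\left[\frac{d_t x(t)}{dt} - \mu(x,t)\right]^2 dt\right\} \tag{42.43}$$

The (exact) equation satisfied by $G_D(x,x';t,t')$ is⁴⁰

$$\left[\partial_t + \mu(x,t)\,\partial_x + \tfrac{1}{2}\sigma_0^2\,\partial_x^2 - r_0\right] G_D(x,x';t,t') = -\delta(x-x')\,\delta(t-t') \tag{42.44}$$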


Proof of the Path Integral Satisfying the Semi-Group Property
It is important to note that the semigroup property over finite times is satisfied by
the full path-integral expression for $G_D$ exactly. Thus, for finite time intervals
$(t_0,t_1)$ and $(t_1,t_2)$ we have the exact result

$$G_D(x_0,x_2;t_0,t_2) = \int_{-\infty}^{\infty} dx_1\;G_D(x_0,x_1;t_0,t_1)\,G_D(x_1,x_2;t_1,t_2) \tag{42.45}$$


40 The Backward Diffusion Equation: Because options involve backward-moving
logic, we exhibit backward (Kolmogorov) diffusion equations, which do not involve
derivatives of the drift or volatility functions. The backward equation varies the initial
space point and time. Forward (Fokker-Planck) equations that move the final space point
and time do involve such derivatives, as shown by Uhlenbeck and Ornstein (Ref. viii). It
would seem like x-derivatives of $\mu(x,t)$ could appear in the backward equation that
contains differentiation with respect to initial variables $(x,t)$ because the Green function
does after all depend on $\mu(x,t)$. In fact, such derivatives do appear at intermediate stages
but then these terms cancel out at the end.


41 No Unitarity here: A point regarding backward and forward propagation involves
“unitarity” which holds in quantum mechanics (QM). In QM, if we first propagate from
space point $x_0$ at time $t_0$ forward to space point $x_1$ at time $t_1 > t_0$, and then propagate
backward in time to space point $y_0$ at the initial time $t_0$, and integrate over $x_1$, the result is
the Dirac delta function $\delta(x_0 - y_0)$. For real diffusion, which is the case here, unitarity
does not hold even for constant drift and volatility. Propagating forward and then
backward, and integrating over $x_1$, leads to a nonzero result regardless of $x_0$ and $y_0$.
The proof is immediate, as illustrated in Fig. 3 at the end of the chapter.
$G_D(x_0,x_2;t_0,t_2)$ contains consecutive propagations with integrations over all
$x(t)$ with $t_0 < t < t_2$. $G_D(x_0,x_1;t_0,t_1)$ contains consecutive propagations with
integrations over all $x(t)$ with $t_0 < t < t_1$. Finally, $G_D(x_1,x_2;t_1,t_2)$ contains
consecutive propagations with integrations over all $x(t)$ with $t_1 < t < t_2$. All the
propagations are present, and all integrations but one over $x_1$ are accounted for.
Therefore, the $x_1$ integration must be inserted over the product of the two
propagators $G_D(x_0,x_1;t_0,t_1)\,G_D(x_1,x_2;t_1,t_2)$ in order to get the overall
propagator $G_D(x_0,x_2;t_0,t_2)$. This is just Eqn. (42.45).
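
For the special case of constant drift the semigroup property can be checked numerically. The sketch below is an illustration only: it assumes the discounted Gaussian form of the free propagator, written in terms of a positive elapsed time `tau`, and it replaces the intermediate integral by a Riemann sum on a grid whose width and spacing are arbitrary choices.

```python
import math

def green_free(x, x_prime, tau, r0, mu, sigma0):
    """Assumed discounted free propagator from x to x_prime over elapsed time tau > 0."""
    var = sigma0 ** 2 * tau
    return (math.exp(-r0 * tau) / math.sqrt(2.0 * math.pi * var)
            * math.exp(-(x_prime - x - mu * tau) ** 2 / (2.0 * var)))

def convolved(x0, x2, tau01, tau12, r0, mu, sigma0, half_width=3.0, n=6001):
    """Riemann-sum version of the integral over the intermediate point x1."""
    dx = 2.0 * half_width / (n - 1)
    center = x0 + mu * tau01
    total = 0.0
    for i in range(n):
        x1 = center - half_width + i * dx
        total += (green_free(x0, x1, tau01, r0, mu, sigma0)
                  * green_free(x1, x2, tau12, r0, mu, sigma0))
    return total * dx

if __name__ == "__main__":
    r0, sigma0 = 0.05, 0.2
    mu = r0 - 0.5 * sigma0 ** 2
    print(convolved(0.0, 0.1, 0.25, 0.5, r0, mu, sigma0),
          green_free(0.0, 0.1, 0.75, r0, mu, sigma0))
```

Both the Gaussian widths and the discount factors compose correctly, so the two printed numbers agree. With a state-dependent drift the same convolution still holds for the full path integral, but each factor would itself have to be built from many small steps.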

The equation for the option price $C(x_0,t_0)$ is found simply by integrating
the payoff value multiplied by the Green function exactly as in the Black Scholes
model. The equation is the same backward equation as satisfied by the Green
function, since the differential operators wind up acting only on the initial step in
the multiple product defining the Green function. The result is the Black-Scholes
equation replacing the constant drift $\mu_0$ by $\mu(x_0,t_0)$, and is

$$\frac{\partial C}{\partial t_0} - r_0\,C + \mu(x_0,t_0)\,\frac{\partial C}{\partial x_0} + \tfrac{1}{2}\sigma_0^2\,\frac{\partial^2 C}{\partial x_0^2} = 0 \tag{42.46}$$


More development of the general drift formalism will be given in the 
discussion of the Macro-Micro model in Ch. 47-51. 


Discrete Bermuda Options 


We now consider an option with multiple strike or exercise dates. These multiple 
dates are said to form a schedule. If the schedule is discrete, the option is called a 
Bermuda option. If exercise is possible at any time (a continuous schedule) the 
option is called an American option. Relative to a European option, an option 
with multiple exercises is clearly worth more, since the option holder has the 
right to exercise at more times than just once. Determining this extra “premium” 
uses rather messy logic and takes a lot of numerical effort. The logic is called 
backward induction. To gain insight, we present the general formalism using path
integrals⁴². A special case was first treated by Geske and Johnson.


42 Dividends: We will include the dividend-yield case for simplicity in the development,
but as we shall show the results can easily be generalized. We need dividends for call 
options because otherwise there is no premium of American over European options. For 
put options, there is a premium even if no dividends are present. 
The essential problem is in the boundary condition at each strike date $t_\ell^*$
where $\ell = 1,\dots,L$. An optimality condition is used to determine whether the
option is exercised⁴³. The idea is that the option holder, as the time on his watch
moves forward, asks at each $t_\ell^*$ whether holding or exercising the option is more
profitable. The answer again involves boundary-condition information given by
convention in the future. This information is propagated backward in time by the
model equations from the future times where the boundary conditions are
applicable to the time on the watch.

Consider a call option. Suppose that the stock price at $t_\ell^*$ satisfies
$S(t_\ell^*) > E_\ell$, with $E_\ell$ the strike price at time $t_\ell^*$. Then the option is in the
money and so can be exercised profitably. Theoretically, however, this exercise may not be
the most profitable course: it is possible that the anticipated return from the
collective possibility of future exercise at the later strike dates $t_j^*$ with $j > \ell$ is
greater than the profit equal to the intrinsic value $S(t_\ell^*) - E_\ell$ from exercising at
time $t_\ell^*$. We need to find the points at which there is no difference in value
between these two possible courses of action. We next turn to this problem.


The Atlantic or Critical Path for Bermudas
In order to discuss this problem it is convenient to divide each $x_\ell^*$ axis into two
sections, the $\ell^{\rm th}$ “European” section and “American” section, separated by a
point $\hat x_\ell$. For lognormal dynamics, $x_\ell^* = \ln(S(t_\ell^*))$. We will soon discuss
American options defined by an infinite call or put schedule (i.e. $L \to \infty$), and
the points $\{\hat x_\ell\}$ will merge to form a path $\hat x(t)$, which we shall call the “Atlantic
path”⁴⁴. The Atlantic path separates $x(t)$ space into the “American region” and


43 Do American option holders really employ the optimality condition to decide to
exercise: Unless the option is liquid and the market price is known, probably not, because 
most option holders do not have software to price American options and therefore could 
not figure out what the optimal condition would say. Sometimes in practice, exercise 
decisions are probably made on the basis of some scenario for the underlying believed (or 
feared) by the option holder, not by averaging over all paths using complex logic. Real- 
world financial aspects may enter as well. See the discussion of the Viacom CVR 
described in Ch. 13 for an example. 


44 Why the name Atlantic Path? This name is intuitive because this path lies midway
between the American or Bermudan region where early exercise takes place and the 
the “European region” formed by the continuum of the American sections and
European sections. For the discrete case, $x_\ell^* > \hat x_\ell$ for calls and $x_\ell^* < \hat x_\ell$ for
puts defines the $\ell^{\rm th}$ American section, where the option is exercised at time $t_\ell^*$.
The $\ell^{\rm th}$ European section, where the option is held at least to time $t_{\ell+1}^*$, is
$x_\ell^* < \hat x_\ell$ for calls and $x_\ell^* > \hat x_\ell$ for puts. The Atlantic path then consists of the
discrete set of points $\{\hat x_\ell\}$. The idea is illustrated in Figs. 4A, 4B at the end of
the chapter.


Path Classes

Leave aside the problem of actually getting the discrete Atlantic path $\{\hat x_\ell\}$ for a
moment. Then it is easy to see that paths break up into classes. If exercise occurs
at time $t_\ell^*$, not before, clearly the paths must cross all $x_j^*$ axes with $1 \le j < \ell$
such that $x_j^* < \hat x_j$ for calls, $x_j^* > \hat x_j$ for puts. Since exercise at $t = t_\ell^*$ means
the procedure stops at $t_\ell^*$, no path continuation for $t > t_\ell^*$ is relevant. Between
successive times $(t_j^*, t_{j+1}^*)$ and before exercise, the paths are unrestricted. For
this reason, the propagation between such successive times is free and is
therefore effected with the free propagator $G_0(x_j^* - x_{j+1}^*;\,t_j^* - t_{j+1}^*)$ with
$\mu = \mu_0 - D_y$. This is also true for $j = 0$ if we interpret $x_0^* = x_0$, $t_0^* = t_0$.


The idea is illustrated in Figs. 5 and 6 at the end of the chapter. Fig. 5 shows 
the path classes for a call schedule and Fig. 6 corresponds to a put schedule. 


Contributions from Path Classes for Bermuda Options
Hence, we write the option value at present time $t_0$, with present stock price
$S(t_0) = \exp(x_0)$, as a sum over contributions from path classes labeled by $\ell$,

$$C(x_0,t_0) = \sum_{\ell=1}^{L} C^{(\ell)}(x_0,t_0) \tag{42.47}$$


European region where exercise does not yet occur. Usually this is called the critical path 
or the free boundary. 
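For a call option specified by a schedule with possible exercise at $t_\ell^*$, we have⁴⁵

$$C^{(\ell)}_{\rm call}(x_0,t_0) = \int_{\hat x_\ell}^{\infty} dx_\ell^*\;G^{(\ell)}_{\rm call}(x_0,x_\ell^*;t_0,t_\ell^*)\,\left[S(t_\ell^*) - E_\ell\right] \tag{42.48}$$

Here, the Green function $G^{(\ell)}_{\rm call}$ is built up by a product of free propagators
$G_j^{(\mu)} = G_0\big|_{\mu}$ with drift $\mu = \mu_0 - D_y$. Thus,

$$G^{(\ell)}_{\rm call}(x_0,x_\ell^*;t_0,t_\ell^*) = \prod_{j=1}^{\ell-1}\int_{-\infty}^{\hat x_j} dx_j^* \prod_{j=0}^{\ell-1} G_j^{(\mu)}(x_j^* - x_{j+1}^*;\,t_j^* - t_{j+1}^*) \tag{42.49}$$

This holds if $\ell > 1$ (for $\ell = 1$ no integrals are present).
For a put option specified by a schedule with possible exercise at $t_\ell^*$, we have

$$C^{(\ell)}_{\rm put}(x_0,t_0) = \int_{-\infty}^{\hat x_\ell} dx_\ell^*\;G^{(\ell)}_{\rm put}(x_0,x_\ell^*;t_0,t_\ell^*)\,\left[E_\ell - S(t_\ell^*)\right] \tag{42.50}$$

Here for $\ell > 1$,

$$G^{(\ell)}_{\rm put}(x_0,x_\ell^*;t_0,t_\ell^*) = \prod_{j=1}^{\ell-1}\int_{\hat x_j}^{\infty} dx_j^* \prod_{j=0}^{\ell-1} G_j^{(\mu)}(x_j^* - x_{j+1}^*;\,t_j^* - t_{j+1}^*) \tag{42.51}$$

In order to evaluate the discrete Atlantic path $\{\hat x_\ell\}$ we will need the option
values at times $t_{L-m}^*$ and points $x_{L-m}^*$, with $m = 1,\dots,L$. We have

$$C(x_{L-m}^*,t_{L-m}^*) = \sum_{\ell=L-m+1}^{L} C^{(\ell)}(x_{L-m}^*,t_{L-m}^*) \tag{42.52}$$

For a call,

$$C^{(\ell)}_{\rm call}(x_{L-m}^*,t_{L-m}^*) = \int_{\hat x_\ell}^{\infty} dx_\ell^*\;G^{(\ell)}_{\rm call}(x_{L-m}^*,x_\ell^*;t_{L-m}^*,t_\ell^*)\,\left[S(t_\ell^*) - E_\ell\right] \tag{42.53}$$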


45 Intrinsic Value Positivity: The intrinsic value in the region of integration is positive.
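Here, for $\ell = L-m+1,\dots,L$ the call Green function is

$$G^{(\ell)}_{\rm call}(x_{L-m}^*,x_\ell^*;t_{L-m}^*,t_\ell^*) = \prod_{j=L-m+1}^{\ell-1}\int_{-\infty}^{\hat x_j} dx_j^* \prod_{j=L-m}^{\ell-1} G_j^{(\mu)}(x_j^* - x_{j+1}^*;\,t_j^* - t_{j+1}^*) \tag{42.54}$$

For a put,

$$C^{(\ell)}_{\rm put}(x_{L-m}^*,t_{L-m}^*) = \int_{-\infty}^{\hat x_\ell} dx_\ell^*\;G^{(\ell)}_{\rm put}(x_{L-m}^*,x_\ell^*;t_{L-m}^*,t_\ell^*)\,\left[E_\ell - S(t_\ell^*)\right] \tag{42.55}$$

$$G^{(\ell)}_{\rm put}(x_{L-m}^*,x_\ell^*;t_{L-m}^*,t_\ell^*) = \prod_{j=L-m+1}^{\ell-1}\int_{\hat x_j}^{\infty} dx_j^* \prod_{j=L-m}^{\ell-1} G_j^{(\mu)}(x_j^* - x_{j+1}^*;\,t_j^* - t_{j+1}^*) \tag{42.56}$$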


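Semigroup Properties for the Call and Put Green Functions for Bermudas

We note in passing that $G_{\rm call}$ and $G_{\rm put}$ satisfy different semigroup properties,
written again for finite time intervals $(t_0,t_1^*)$, $(t_1^*,t_2^*)$:

$$G_{\rm call}(x_0,x_2^*;t_0,t_2^*) = \int_{-\infty}^{\hat x_1^{\rm call}} dx_1^*\;G_{\rm call}(x_0,x_1^*;t_0,t_1^*)\,G_{\rm call}(x_1^*,x_2^*;t_1^*,t_2^*) \tag{42.57}$$

$$G_{\rm put}(x_0,x_2^*;t_0,t_2^*) = \int_{\hat x_1^{\rm put}}^{\infty} dx_1^*\;G_{\rm put}(x_0,x_1^*;t_0,t_1^*)\,G_{\rm put}(x_1^*,x_2^*;t_1^*,t_2^*) \tag{42.58}$$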


Two generalizations of these results are possible. First, with appropriate
integration limit changes, a mixed schedule of calls and puts can be
accommodated. The $\{\hat x_\ell\}$, which are different for calls and puts, would also
change. Second, the general dividend case with $\mu(x,t)$ discussed earlier can
evidently be accommodated by replacing each free propagator
$G_j^{(\mu)}(x_j^* - x_{j+1}^*;\,t_j^* - t_{j+1}^*)$ with the path integral $G_D(x_j^*,x_{j+1}^*;t_j^*,t_{j+1}^*)$ of Eqn.
(42.42). In using this equation, the index $j$ on the right-hand side should be
replaced by $j' = 1,\dots,n$ for the partition between times $(t_j^*,t_{j+1}^*)$ into $n$ intervals
in an obvious fashion.


Atlantic Critical Path Algorithm 
We turn to the determination of the discrete Atlantic or critical path formed by 


the set of points | a for a call schedule and n ut for a put schedule; here 


/ =1,...,L. This procedure is well known, but we present it anyway for 


completeness using this language. At time t = b , the last possible exercise date, 
х= In(E 8 for both calls and puts since no ambiguity is left to the option 


holder who holds his option to t (either the option is in the money or it isn't, and 
no further possible decisions exist). 
Now consider t = bo . An option holder who holds the option until this time 


will get + [5 (Са ) – Е -] for a call (+) or put (-) if the option is in the money 
and if he exercises. If he doesn't exercise, he keeps his option with value equal to 


co (x btr a) defined by Eqn. (42.53) or Eqn. (42.55) with / = L , equal to 


* 


C EOM rom , Which is the option value diffused back to Е from t. The value 


of x, , is defined as that value such that no difference results from exercising or 


1-1 
not exercising at baa For a call this is 


exp(¥ m)- E, =œ) 


call 


[Und Ü )>0 (42.59) 


Хб 


For a put we have ће corresponding condition: 


E, exp elm CORRI Ü 


put V 1-1? 1-1 


)> 0 (42.60) 


This process is continued. At $t_{L-2}^*$ the option holder may expect to get $C(x_{L-2}^*, t_{L-2}^*)$, i.e. the sum of two terms from the anticipated possibility of exercising at times $t_{L-1}^*$ (l = L-1) or $t_L^*$ (l = L). When the positive quantity $C(\hat{x}_{L-2}, t_{L-2}^*)$ is set equal to the option intrinsic value at $t_{L-2}^*$ as in Eqn. (42.59) or (42.60), $\hat{x}_{L-2}$ is determined. In general, $\hat{x}_{l-1}$ is determined by setting the quantity $C(\hat{x}_{l-1}, t_{l-1}^*)$ equal to the option intrinsic value at $t_{l-1}^*$. All paths start in the American sectors of the $x_{l'}^*$ axes ($l' = l,...,L$) and cross intermediate $x^*$-axes back to $t_{l-1}^*$ in their European sectors. In this way, all the parameters of the discrete Atlantic path $\{\hat{x}_l^{call}\}$ for a call or $\{\hat{x}_l^{put}\}$ for a put are determined.


Again, the idea is illustrated in Figs. 5,6 at the end of this chapter. 
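
The backward recursion just described can be sketched numerically. The following is only an illustrative sketch, not the book's implementation: it assumes a Bermuda put with a constant strike E, equally spaced exercise dates, constant rate and volatility, zero dividends, and a simple grid convolution with the Gaussian propagator. The function name `bermuda_put_critical_path` and all parameter names are hypothetical.

```python
import numpy as np

def bermuda_put_critical_path(E, r, sigma, tau, L, half_width=3.0, n=1201):
    """Critical log-prices x_hat[l] (l = 0..L-1 stands for exercise dates 1..L)
    of a Bermuda put, found by backward induction on a log-price grid.
    Assumptions (illustrative): constant strike, rate, volatility; no dividends."""
    x = np.linspace(np.log(E) - half_width, np.log(E) + half_width, n)
    dx = x[1] - x[0]
    mu = r - 0.5 * sigma**2            # no-arbitrage drift of x = ln S
    var = sigma**2 * tau
    # K[i, j]: discounted Gaussian weight for diffusing from x[i] to x[j] over tau
    K = (np.exp(-r * tau)
         * np.exp(-(x[None, :] - x[:, None] - mu * tau) ** 2 / (2.0 * var))
         / np.sqrt(2.0 * np.pi * var) * dx)
    intrinsic = E - np.exp(x)
    value = np.maximum(intrinsic, 0.0)  # option value at the last exercise date
    x_hat = np.empty(L)
    x_hat[L - 1] = np.log(E)            # last date: exercise iff in the money
    for l in range(L - 2, -1, -1):
        cont = K @ value                # value of holding, diffused back one period
        diff = intrinsic - cont         # > 0 in the American (exercise) sector
        i = np.nonzero((diff[:-1] >= 0.0) & (diff[1:] < 0.0))[0][-1]
        # linear interpolation of the indifference point between grid nodes
        x_hat[l] = x[i] + dx * diff[i] / (diff[i] - diff[i + 1])
        value = np.maximum(intrinsic, cont)
    return x_hat
```

For example, `np.exp(bermuda_put_critical_path(100.0, 0.05, 0.2, 0.25, 4))` gives the critical stock prices at four quarterly exercise dates. The grid width and spacing are arbitrary choices and should be checked for convergence in any real use.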

This completes the determination of the discrete Atlantic path for a call 
schedule or a put schedule. Some options have mixed conditions—at some times 
the option decision is a call and at other times a put. For such a mixed call/put 
schedule, careful tracking of the various American and European sectors of the $x^*$-axes must be done to get the Atlantic path points $\{\hat{x}_l\}$.


Expression for a Bermuda Put Option 
To close this section, we exhibit the form of the put option price including L puts in a schedule, assuming lognormal diffusion and a dividend yield.

The case L = 3 for puts was written down by Geske and Johnson, as we describe below. They also saw that the American put option would be the $L \to \infty$ limit with $E_l = E$, a constant American put option strike price.

We use constant effective drift $\mu_D = \mu_0 - D_y$ and volatility $\sigma_0$. As derived above, the put option price breaks up into a sum of terms. For simplicity, we set $\tau = t_l^* - t_{l-1}^*$ equal to a constant. Then the $l$-th put term in Eqn. (42.50) has the explicit form


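$$C_{put}^{(l)}(x_0) = e^{-r_0 l \tau} \left[ \prod_{j=1}^{l-1} \int_{\hat{x}_j^{put}}^{\infty} dx_j \right] \int_{-\infty}^{\hat{x}_l^{put}} dx_l \left[ E_l - S(x_l) \right] \prod_{j=0}^{l-1} \frac{1}{(2\pi\sigma_0^2\tau)^{1/2}} \exp\left[ -\frac{(x_{j+1} - x_j - \mu_D \tau)^2}{2\sigma_0^2 \tau} \right] \qquad (42.61)$$

Now make the changes of variables to $y_j(x_j, \mu) = [x_j - x_0 - j\mu\tau] / (j\sigma_0^2\tau)^{1/2}$ for j = 1,2,...,l. There is no interlocking of limits in the variables of integration since $x_0$ is fixed. We find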


cu (х„)=е EI, ({ y" |) ЛЕ ({ p" |) (42.62) 
Here $y_j^{(\mu_D)} = y_j(x_j = \hat{x}_j, \mu_D)$ and $y_j^{(\tilde{\mu}_D)} = y_j(x_j = \hat{x}_j, \tilde{\mu}_D)$. As before for dividend yields, $\mu_D = \mu_0 - D_y = r_0 - D_y - \frac{1}{2}\sigma_0^2$ and $\tilde{\mu}_D = \mu_D + \sigma_0^2$. We also have defined the integral


m f£-] © j уу / 
iff "+ J d AE J А а= 
і d (42.63) 


(42.64) 


The integrals of Eqn. (42.62) can be related to the multivariate integral notation by identification of variables. For l = 3, the trivariate integral appears as related to our quantity $I_3$. Denoting the trivariate integral as $N_3(h, k, j; \rho_{12}, \rho_{13}, \rho_{23})$ we have


I (550 7) = N (C P,- 9“ .– 1 s = 3) (42.65) 


The multivariate notation is potentially confusing, since only correlations 
between nearest neighbor co-ordinate variables actually exist in the propagators 
by construction. However, the notation for the trivariate integral seems to suggest the existence of a next-to-nearest-neighbor correlation $\rho_{13}$ between variables 1 and 3. In fact, the $y_1 y_3$ terms cancel out. This completes the discussion of discrete-schedule Bermuda options. We turn next to American options.
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American Options 

The American call or put option, as mentioned above, is just the infinite L limit of a call-schedule option or a put-schedule option. All the work has been done. Integrals over all the $\{x_l^*\}$ become integrals over $x^*(t)$ for all t, and since these are dummy integration variables, we may remove the star * and write integrals over all x(t), i.e. over all values of x at each t. The Atlantic path $\hat{x}(t)$ is formed by the continuous limit of the $\{\hat{x}_l\}$ as the difference between the decision times goes to zero, $t_l^* - t_{l-1}^* \to 0$. Now propagation in the European region means integration over x(t) such that $x(t) < \hat{x}^{call}(t)$ for a call, $x(t) > \hat{x}^{put}(t)$ for a put.

The reader may wonder about the propagation between $(t_{l-1}^*, t_l^*)$ in the continuous limit. For the discrete case, x(t) was unrestricted for $t_{l-1}^* < t < t_l^*$. Now a path x(t) must stay restricted to the European region except at its starting point. The resolution is that the excursions of a path x(t) away from European sections between $(t_{l-1}^*, t_l^*)$ are suppressed by the Gaussian propagator to a greater and greater extent as $t_l^* - t_{l-1}^* \to 0$. This is because there is no time for the path to perform a random walk away from the European section at one decision time and still get back to the European section at the next decision time.

Finally, the discrete numbers $E_l$ are replaced by E, the American option strike if constant, or by the appropriate time-dependent strike price E(t).


Appendix 1: Girsanov’s Theorem and Path Integrals 


In this Appendix, we give two straightforward and related derivations of 
Girsanov's theorem using path integrals *'. 


First Derivation of Girsanov’s Theorem 


The essential point is contained in the simple remark that a shift in the drift of a 
Gaussian produces the original Gaussian multiplied by some factors. Thus, 
consider the typical Green function used throughout this paper for propagation 
with drift $\mu = \mu[x(t),t]$ and volatility $\sigma = \sigma[x(t),t]$ in an infinitesimal negative time interval $\Delta t = t - t'$ with space displacement equal to $x - x' = x(t) - x(t - \Delta t)$,

$$G^{(\mu)}(x,x';t,t') = \frac{1}{\left[ -2\pi\sigma^2\Delta t \right]^{1/2}} \exp\left[ \frac{(x - x' - \mu\Delta t)^2}{2\sigma^2\Delta t} \right] \qquad (42.66)$$


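We have indicated the dependence of G on $\mu$ explicitly by the superscript. Aside from technical points of convergence of integrals, the exact dependence of $\mu$ and $\sigma$ on x(t) and t is general and irrelevant to the discussion⁴⁶.

Consider the path-integral expectation value of some arbitrary quantity $Y[\{x(t),t\}]$ with respect to $G^{(\mu)}$ for times between $t_a$ and $t_b$, defined as the $\Delta t \to 0$ limit of

$$\langle Y \rangle^{(\mu)}_{t_a,t_b} = \int \prod_{j=0}^{n-2} \left[ G^{(\mu)}_{j,j+1}\, dx_{j+1} \right] G^{(\mu)}_{n-1,n}\; Y[\{x_j,t_j\}] \qquad (42.67)$$

Here $G^{(\mu)}_{j,j+1}$ denotes the infinitesimal propagator between the neighboring times $t_j$ and $t_{j+1}$ of the partition.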


E.g., Y could be a discount factor, or an option payout depending on x(t). For Y = 1 we just recover the path integral for the finite-time propagator.
Now suppose we rewrite $\mu$ as consisting of two pieces,

$$\mu = \mu_0 + \mu_1 \qquad (42.68)$$

Here, $\mu_0$ and $\mu_1$ may depend on x(t) and t. A little algebra produces

$$G^{(\mu)}(x,x';t,t') = \exp\left[ -\frac{\mu_1}{\sigma^2}\left( x - x' - \mu_0\Delta t \right) + \frac{\mu_1^2 \Delta t}{2\sigma^2} \right] G^{(\mu_0)}(x,x';t,t') \qquad (42.69)$$

Here, $G^{(\mu_0)}$ is the same Gaussian as $G^{(\mu)}$ but with $\mu$ replaced by $\mu_0$.
Now the interpretation of $G^{(\mu)}$ as the infinitesimal-time diffusion propagator is consistent with the statement that the stochastic equation for x(t) is


46 First use of Local Volatility σ[x(t),t]? This proof, written up in 1988, was perhaps one of the first times that local volatility, i.e. volatility depending on the spatial co-ordinate, was used in finance.




$$x(t) - x(t - \Delta t) - \mu[x(t),t]\,\Delta t = -\sigma[x(t),t]\,\Delta z(t) \qquad (42.70)$$


Here Az(t) is a Gaussian (Wiener) random variable. 
On the other hand, the stochastic equation producing the diffusion equation 


for which $G^{(\mu_0)}$ is the infinitesimal propagator is

$$\tilde{x}(t) - \tilde{x}(t - \Delta t) - \mu_0[\tilde{x}(t),t]\,\Delta t = -\sigma[\tilde{x}(t),t]\,\Delta z(t) \qquad (42.71)$$

Here we have put a tilde over the variable x as a label to indicate that the drift is $\mu_0$ rather than $\mu$. Actually, since both x(t) and $\tilde{x}(t)$ serve only as dummy integration variables, this label is not really needed.

In the integral Eqn. (42.67) we may change variables from $\{x(t)\}$ to $\{\tilde{x}(t)\}$. In performing this change of variables, we need to keep the probability density weight for a given path unchanged (see footnote 7). That is, for a given path specified at times $\{t_j\}$ by numbers $\{x_j^{(path)}\}$, the integrands of the original integral at $x_j = x_j^{(path)}$ and the transformed integral at $\tilde{x}_j = x_j^{(path)}$ must have the same value. Schematically using an obvious notation for the Green functions propagating over neighboring times, we mean that

$$\prod_{j=0}^{n-2} \left[ G_{j,j+1}\, dx_{j+1} \right] \cdot G_{n-1,n} = \text{Invariant} \qquad (42.72)$$
j-0 


This is clearly satisfied when we insert Eqn. (42.69) for each infinitesimal 
propagator. We may then use the stochastic equation for x(t) in Eqn. (42.71). 


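We denote the positive infinitesimal time step $dt = -\Delta t$, take the limit as $dt \to 0$, and call $dz = \Delta z$. Denote the drift function $\tilde{\mu}_1 = \mu_1[\tilde{x}(t),t]$ and the volatility function $\tilde{\sigma} = \sigma[\tilde{x}(t),t]$. We obtain

$$\langle Y \rangle^{(\mu)}_{t_a,t_b} = \left\langle \prod_{t} \exp\left[ \frac{\tilde{\mu}_1}{\tilde{\sigma}}\, dz(t) - \frac{\tilde{\mu}_1^2}{2\tilde{\sigma}^2}\, dt \right] Y[\{\tilde{x}(t),t\}] \right\rangle^{(\mu_0)}_{t_a,t_b} \qquad (42.73)$$

$$= \left\langle \exp\left\{ \int_{t_a}^{t_b} \left[ \frac{\tilde{\mu}_1}{\tilde{\sigma}}\, dz(t) - \frac{\tilde{\mu}_1^2}{2\tilde{\sigma}^2}\, dt \right] \right\} Y \right\rangle^{(\mu_0)}_{t_a,t_b} \qquad (42.74)$$

This is the desired result, Girsanov's theorem.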
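
The single-step identity behind this derivation, Eqn. (42.69), is easy to check numerically. The sketch below is illustrative only; the function names are hypothetical, and constant drift and volatility over the step are assumed. It evaluates the Gaussian of Eqn. (42.66) for a negative time step and compares it with the shifted-drift Gaussian multiplied by the exponential factor.

```python
import math

def green(x, xp, dt, mu, sigma):
    """Infinitesimal Gaussian propagator of Eqn. (42.66); dt = t - t' is negative."""
    var = -sigma**2 * dt                       # positive because dt < 0
    return math.exp(-(x - xp - mu * dt) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def drift_shift_factor(x, xp, dt, mu0, mu1, sigma):
    """Exponential factor relating G^(mu0 + mu1) to G^(mu0), as in Eqn. (42.69)."""
    return math.exp(-(mu1 / sigma**2) * (x - xp - mu0 * dt)
                    + mu1**2 * dt / (2.0 * sigma**2))

# Illustrative numbers (assumptions, not taken from the text)
x, xp, dt = 0.03, 0.0, -0.01
mu0, mu1, sigma = 0.02, 0.05, 0.2
lhs = green(x, xp, dt, mu0 + mu1, sigma)
rhs = drift_shift_factor(x, xp, dt, mu0, mu1, sigma) * green(x, xp, dt, mu0, sigma)
```

The two numbers `lhs` and `rhs` agree to rounding error, which is the statement that shifting the drift of a Gaussian reproduces the original Gaussian times a factor.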


A Second, Quicker Derivation of Girsanov's Theorem 


A more succinct derivation uses the stochastic equations as delta-function 
constraints in the path integral, as described in the text. We just write down the 
path integral expectation including the stochastic equation as a constraint twice. 


• First, use the full drift $\mu$.
• Second, use only part of the drift $\mu_0$, but include an extra factor $\exp\{\xi[\tilde{x}(t),t]\}$. This extra factor is chosen to force the identity of these two procedures.


Carrying out the two steps produces the identity 


EOE cu = lim] Í a) dn(t) z; Т exp| lg! (2) dt | 


о-н). = 0] 


(42.76) 
Since the Dirac delta functions just eliminate the integrals over the $\eta(t)$ variables, it is easy to find this factor $e^{\xi}$ just by comparing the exponents in the integrands. The answer is the same as in the first derivation, reproducing Girsanov's theorem.


Appendix 2: A Short Dictionary of Common Notations 
For the convenience of the reader, below is the translation of other nomenclature 
into the language of path integrals: 

• Expectation E(·): Writing $\langle Y \rangle = E(Y)$ gives a standard notation for expectation. Identification of $\mu_0$, $\mu_1$ and Y in various circumstances provides the explicit connection between the notation used in this chapter and other notations.

• Numeraire A: Reciprocal of the factor multiplying the Gaussian in Eq. (42.69).

• Measure dG: The second factor in Eq. (42.69) times the ordinary calculus dx(t), viz. $dG(t) = G[x(t), x(t - \Delta t); t, t - \Delta t]\, dx(t)$ with time step $\Delta t < 0$. This is the probability of transition from $x(t - \Delta t)$ back to x(t) within the "bin" of height dx(t) (an ordinary calculus differential) around x(t).


• Filtration $\mathcal{F}_t$: Market information over time up to time t, with probability given by the path integral (for a given time partition) as an iterated convolution of ordinary integrals up to time t with differentials $\prod G \cdot dx(t)$, equivalent to stochastic paths. Recall that stochastic equations are used directly to obtain the path integral.
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e Radon-Nikodym Derivative: Write С = f dG) and also equivalently 


(uo) 
С = и We call dG”) = 


dG). The quantity 


dG 
dG? N (д) 


is the Radon-Nikodym derivative.


• Martingale: A martingale is given by the expectation $Y(t) = \langle Y \rangle = E(Y)$ with the appropriate Green function, viz. $Y(t) = \langle Y \rangle$ as defined in Eq. (42.67).

• Brownian Bridge: Paths from a point $x(t - \Delta t)$ back to a point x(t) within the ordinary calculus differential dx(t) as a bin around x(t) at time t. The probability measure is as above.


Appendix 3: No-Arbitrage, Hedging and Path Integrals 


This appendix deals with no arbitrage and hedging in the framework of path 
integrals. It is important to understand that the usual no-arbitrage conditions are 
directly implied by the path integral formalism when coupled with the standard 
hedging recipes. The basic reason is that the Green function or path integral is 
itself the general solution of the diffusion equation. The diffusion equation and its 
boundary and terminal conditions determine everything about an option. 
Therefore, any statement about no-arbitrage or market consistency of the option 
must be possible by adjustment of suitable parameters in the path integral. 

Although equivalent, the procedure here is somewhat different from the 
textbook procedure. We first construct the Green function directly. The form of 
the Green function does not need any hedging argument, just the stochastic 
equations and their Gaussian nature. Then, through a suitable choice of the drift parameter in the Green function along with the usual assumption of standard Δ-hedging, we arrive at the no-arbitrage result. Hence, there is consistency, though perhaps this approach can lead to a somewhat modified philosophy.

For illustration⁴⁷, consider the simple zero-dividend stock option with lognormal dynamics using the variable x = ln S. The Green function probability


47 Generality of the No-Arbitrage and Green Function/Path Integral Method: The
method is general. For no-arbitrage in fixed income curve construction using path 
integrals, see the next chapter. 
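density measure $G_{0*}\,dx^*$ for the transition from $(x_0, t_0)$ to $(x^*, t^*)$ in $dx^*$ with $T_{0*} = t^* - t_0$ is the familiar quantity

$$G_{0*}\,dx^* = G(x_0, x^*; t_0, t^*)\,dx^* = \frac{\Theta(T_{0*})\, e^{-r_0 T_{0*}}}{\sqrt{2\pi\sigma_0^2 T_{0*}}} \exp\left[ -\Phi_{0*} \right] dx^* \qquad (42.77)$$

Here,

$$\Phi_{0*} = \left[ x^* - x_0 - \mu_0 T_{0*} \right]^2 / \left[ 2\sigma_0^2 T_{0*} \right] \qquad (42.78)$$

A European option $C_0 = C(S_0, t_0)$ with terminal value $C(S^*, t^*)$ is

$$C_0 = C(S_0, t_0) = \int_{-\infty}^{\infty} G_{0*}(x_0, x^*; t_0, t^*)\, C(S^*, t^*)\, dx^* \qquad (42.79)$$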
The Path Integral and the Green Function $G_{0*}$


It is important to realize that the Green function $G_{0*}$ arises straightforwardly from Gaussian integration and the usual stochastic equations. For convenience, we again summarize the procedure. We partition the time $T_{0*} = t^* - t_0$ by N+1 points $t_j$, and we set $dt_j = t_{j+1} - t_j$, $t^* = t_N$, and $x_j = \ln(S_j)$. The Gaussian integrals are over N independent Gaussian variables $\eta_j$ (j = 0,...,N-1) with width $1/\sqrt{dt_j}$, which simply defines the probability measure. The stochastic equation of motion is $d_t x_j = x_{j+1} - x_j = (\mu_j^{phys} + \sigma_j \eta_j)\, dt_j$. Here, $\mu_j^{phys}$, the physical drift at time $t_j$, is not the no-arbitrage drift and will be eliminated.


We insert a Dirac delta function constraint arising from the stochastic equation in each $\eta_j$ integral⁴⁸, viz

$$1 = \int dx_{j+1}\, \delta\left[ x_{j+1} - x_j - (\mu_j^{phys} + \sigma_j \eta_j)\, dt_j \right] \qquad (42.80)$$


48 Dirac Delta Function: To remind the reader: the delta function δ(w) is a Schwartz distribution, equal to 0 except at its support (w = 0) where it equals infinity, and has an integral of 1 when integrated over an interval including its support.




We rewrite the δ-function to eliminate each $\eta_j$ in favor of $dx_{j+1}$ and perform the Gaussian integrations over the $x_{j+1}$ variables.
We also insert the product of the discount factors $\prod_j \exp(-r_j\, dt_j)$ needed to properly discount cash flows⁴⁹. The formula above for $G_{0*}$ then emerges in a straightforward fashion. It is a function of the total variance $\sigma_0^2 T_{0*} = \sum_{j=0}^{N-1} \sigma_j^2\, dt_j$, the total drift $\mu_0 T_{0*} = \sum_{j=0}^{N-1} \mu_j^{phys}\, dt_j$, and also $r_0 T_{0*} = \sum_{j=0}^{N-1} r_j\, dt_j$. Ultimately, we take the continuum limit $dt_j \to 0$. We take this after physical quantities have been calculated⁵⁰. This makes everything well defined.
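
The discretized stochastic equation can be illustrated with a small simulation. This is a sketch under stated assumptions, not the procedure of the text: it samples the Gaussian variables with width 1/sqrt(dt_j), builds x at the final time from the stochastic equation, and lets one compare the sample mean and variance of the total displacement with the total drift and total variance sums. The function name, the seed, and the parameter schedules are hypothetical.

```python
import numpy as np

def simulate_terminal_x(x0, mu, sigma, dt, n_paths, seed=0):
    """Sample x at the final time from x_{j+1} - x_j = (mu_j + sigma_j * eta_j) * dt_j,
    where eta_j is Gaussian with zero mean and width 1/sqrt(dt_j)."""
    rng = np.random.default_rng(seed)
    mu, sigma, dt = (np.asarray(a, dtype=float) for a in (mu, sigma, dt))
    eta = rng.standard_normal((n_paths, dt.size)) / np.sqrt(dt)  # width 1/sqrt(dt_j)
    steps = (mu + sigma * eta) * dt
    return x0 + steps.sum(axis=1)

# Illustrative time-dependent schedules (assumptions)
N = 20
dt = np.full(N, 0.05)
mu = np.linspace(0.01, 0.05, N)
sigma = np.linspace(0.15, 0.25, N)
xT = simulate_terminal_x(0.0, mu, sigma, dt, n_paths=200_000)
total_drift = float(np.sum(mu * dt))          # sum_j mu_j dt_j
total_var = float(np.sum(sigma**2 * dt))      # sum_j sigma_j^2 dt_j
```

Up to Monte Carlo noise, `xT.mean()` is close to `total_drift` and `xT.var()` is close to `total_var`, consistent with the Green function depending only on these sums.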


The Elimination of the Drift $\mu^{phys}$ for the No Arbitrage Drift $\mu$


So far, we have been working with $\mu^{phys}$. This is the actual future drift of ln S, and is not related to other parameters. However, this is not what we want for the final answer. For financial reasons, we want to impose a market no-arbitrage constraint. We can easily do this. We remove the stock drift $\mu^{phys}$ from $G_{0*}$ and in its place simply insert a drift parameter $\mu$ that we specify to satisfy the no-arbitrage constraint⁵¹. We also need to make the usual assumption of Δ-hedging. The hedging assumption is separately imposed, as in the standard textbook
The hedging assumption is separately imposed, as in the standard textbook 


49 Discounting: To get the usual formalism, we use continuous discounting.


50 Continuum Limit Again: Is it Relevant? The continuum limit is a fiction. All sorts of time scales exist in trading and in numerical calculations that require partitioning of the time axis. A good part of this book is related to dealing with time scales.


51 More about the "Fictitious World of No Arbitrage" and the Real World: Sometimes the replacement of the physical log-stock drift $\mu^{phys}$ by the parameter $\mu$ is said to place the calculation in a fictitious world. Our philosophy is more mundane. We merely state that, since no one would agree on the stock drift anyway, markets are facilitated by normalizing the calculation to produce a financial result that everyone does agree on, namely the no-arbitrage condition.

Still, the motivations of customers and traders for transactions with options live very much in the real world, where people have their own favorite scenarios for the future average stock behavior $\mu^{phys}$, which may differ markedly from the no-arbitrage drift. The same philosophy drives transactions in fixed-income markets. If scenarios for the real behavior of interest rates in the future differ markedly from the expected behavior of interest rates produced by no-arbitrage, people will do transactions to try to capitalize on their views.
argument. The result is the removal of the uncertainty in the return for a portfolio of stock and an option. We next give the details.


No-Arbitrage, Hedging and Path Integrals: Result
The standard hedging results and the no-arbitrage condition at $t_0$ for a stock option are equivalent to the following statements⁵²:

1. HEDGE: Hedge the option with the usual $N_0 = -\Delta$ shares, where $\Delta = \partial C_0 / \partial S_0$
2. NO ARBITRAGE: Replace the drift $\mu_0^{phys}$ with parameter $\mu_0 = r_0 - \sigma_0^2/2$


Proof. The Green function $G_{0*}$ satisfies a diffusion equation⁵³. Because the option $C_0$ is given by a suitably convergent integral over $G_{0*}$, differentiation can be interchanged with integration, and so $C_0$ satisfies the same diffusion equation as does $G_{0*}$, namely

$$\frac{\partial C_0}{\partial t_0} + \frac{\sigma_0^2}{2} \frac{\partial^2 C_0}{\partial x_0^2} + \mu_0 \frac{\partial C_0}{\partial x_0} - r_0 C_0 = 0 \qquad (42.81)$$

Construct the usual portfolio V of $N_0$ shares and an option. The change dV of the portfolio over time $dt_0$ is $V(S_1, t_1) - V(S_0, t_0)$. Here the initial portfolio is $V(S_0, t_0) = N_0 S_0 + C_0$. Also, $V(S_1, t_1) = N_0 S_1 + C_1$ is the final portfolio, with $S_1$ the stochastic variable given by its stochastic equation.

The option value $C_1 = C(S_1, t_1)$ is the expectation integral using the Green function $G_{1*} = G(x_1, x^*; t_1, t^*)$. Here $G_{1*}$ is $G_{0*}$ with the initial step from $t_0$ to $t_1$ removed. We have

$$C_1 = C(S_1, t_1) = \int_{-\infty}^{\infty} G_{1*}(x_1, x^*; t_1, t^*)\, C(S^*, t^*)\, dx^* \qquad (42.82)$$


52 No-Arbitrage at Other Times: Similar replacements of the $\mu_j$ are made at $t_j$ to ensure no-arbitrage at other times.


53 Which Diffusion Equation? The equation is with respect to the initial variables $S_0$, $t_0$ and so is actually the backward Kolmogorov equation.
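We get the relation between the Green functions $G_{1*}$ and $G_{0*}$ good to $O(dt_0)$ by straightforward algebra:

$$G_{1*}(x_1, x^*; t_1, t^*) = \left[ 1 + (S_1 - S_0) \frac{\partial}{\partial S_0} + dt_0 \frac{\partial}{\partial t_0} + \frac{1}{2} (S_1 - S_0)^2 \frac{\partial^2}{\partial S_0^2} \right] G_{0*}(x_0, x^*; t_0, t^*) \qquad (42.83)$$

We need to interpret $(S_1 - S_0)^2$ as its expected value, namely $\langle (S_1 - S_0)^2 \rangle = S_0^2 \sigma_0^2 \langle \eta_0^2 \rangle dt_0^2 = S_0^2 \sigma_0^2\, dt_0$. To see this, we simply use the expectation value of $\eta_0^2$ over its Gaussian measure with width $1/\sqrt{dt_0}$, obtaining⁵⁴ $\langle \eta_0^2 \rangle = 1/dt_0$. This is equivalent to the result from the Brownian motion of all paths from $(S_0, t_0)$ to $(S_1, t_1)$, using an arbitrarily fine partition inside the small interval $dt_0$.

We get the change dV over time $dt_0$ as

$$\frac{dV}{dt_0} = \left[ \frac{S_1 - S_0}{dt_0} \right] \left( N_0 + \frac{\partial C_0}{\partial S_0} \right) - S_0 \frac{\partial C_0}{\partial S_0} \left( \mu_0 + \frac{1}{2}\sigma_0^2 \right) + r_0 C_0 \qquad (42.84)$$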


HEDGING: The hedging relation $N_0 = -\partial C_0 / \partial S_0$ makes the first bracket disappear and eliminates the dependence of V on the stochastic behavior of the stock price $S_1$ at $t_1$.


NO ARBITRAGE: Using the constraint $\mu_0 = r_0 - \frac{1}{2}\sigma_0^2$ produces $r_0 N_0 S_0$ for the second term. The standard no-arbitrage condition emerges, i.e. the return of the portfolio V of the option and the proper stock hedge is the risk-free rate, viz

$$\frac{dV}{dt_0} = r_0 V \qquad (42.85)$$
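
As a numerical illustration of the result, one can integrate a call payoff against the lognormal Green function with the no-arbitrage drift and compare with the familiar Black-Scholes closed form. This is a minimal sketch with constant parameters and zero dividends; the function names, grid size, and integration width are assumptions made for the example.

```python
import math
import numpy as np

def green_function_call(S0, K, r, sigma, T, n=4001, width=10.0):
    """European call as the integral of the discounted Gaussian Green function
    times the terminal payoff, with drift mu = r - sigma^2 / 2 for x = ln S."""
    x0 = math.log(S0)
    mu = r - 0.5 * sigma**2
    sd = sigma * math.sqrt(T)
    xs = np.linspace(x0 + mu * T - width * sd, x0 + mu * T + width * sd, n)
    G = (math.exp(-r * T) * np.exp(-(xs - x0 - mu * T) ** 2 / (2.0 * sd**2))
         / math.sqrt(2.0 * math.pi * sd**2))
    f = G * np.maximum(np.exp(xs) - K, 0.0)
    dx = xs[1] - xs[0]
    return float(dx * (f.sum() - 0.5 * (f[0] + f[-1])))   # trapezoid rule

def black_scholes_call(S0, K, r, sigma, T):
    """Standard Black-Scholes call formula, used here only as a reference value."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return S0 * N(d1) - K * math.exp(-r * T) * N(d2)

price_green = green_function_call(100.0, 100.0, 0.05, 0.2, 1.0)
price_bs = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

The two prices agree to within the quadrature error. Replacing the drift by some other value in `green_function_call` breaks the agreement, which is one way to see that the no-arbitrage condition is carried entirely by the drift parameter.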


54 Relation to Ito's Lemma: This statement is equivalent to Ito's lemma. The variable η used here is related to the usual Wiener stochastic variable dz with dz² = dt by the relation η = dz/dt, where we discretize first and then let dt → 0 after all physical quantities are calculated. It is less confusing to write dη than to write ddz.
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Appendix 4: Perturbation Theory, Local Volatility, Skew 


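This appendix briefly describes the formalism with a local volatility $\sigma(x)$ using perturbation theory⁵⁵ (see Ref. xix).

Call the velocity $u(x,t) = [x(t + dt) - x(t)]/dt$. Recalling Eqn. (42.20), write the Lagrangian $\mathcal{L}(x,t) = \left[ u + \frac{1}{2}\sigma^2(x) - r \right]^2 / \left[ 2\sigma^2(x) \right]$, and call the unperturbed Lagrangian $\mathcal{L}_0 = \mathcal{L}[\sigma(x) \to \sigma_0]$ with $\sigma_0$ constant. The path integral Green function, taken over paths with $x(t_a) = x_a$ and $x(t_b) = x_b$, is

$$G^{ab} = \int [Dx(t)] \exp\left\{ -\int_{t_a}^{t_b} \mathcal{L}(x,t)\, dt \right\}$$

Next, define the "potential" V(x,t) via $\mathcal{L} = \mathcal{L}_0 - V$. The perturbation expansion is constructed using

$$\exp\left\{ \int_{t_a}^{t_b} V(x,t)\, dt \right\} = \sum_{m=0}^{\infty} \frac{1}{m!} \left[ \int_{t_a}^{t_b} V(x,t)\, dt \right]^m$$

The Born approximation, given by the m = 0,1 terms, is $G^{ab} = G_0^{ab} + G_1^{ab}$. Here, $G_0^{ab} = G^{ab}[\mathcal{L}_0]$ and

$$G_1^{ab} = \int_{t_a}^{t_b} dt_c \int_{-\infty}^{\infty} dx_c\, G_0^{ac}\, V(x_c, t_c)\, G_0^{cb}$$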
t —00 


If o^ (x)= o; + g^ (x — €^ ], the historical example that came up in the 
1980's, volatility decreases for increasing x. Expansion for small g gives 
analytic results. See Ch. 6 for more information on skew. 

The perturbation expansion is singular, because V is velocity dependent, requiring the Schwinger formalism for the discretization specification. Integrations by parts are needed. There is a "mass renormalization" involving counterterms⁵⁶. All 1/dt terms cancel, producing finite results as dt → 0.


Figure Captions for this Chapter 


Fig. 1: A path in co-ordinate x-space, running backward from the final point $x^*$ at time $t^*$ through intermediate steps to the present time $t_0$ at $x_0$. Propagation


55 History: Andy Davidson had the idea for the local volatility σ(x) in the context of mortgages in 1986, which was very early in finance. The V(x,t) that resulted was called the "Mortgage Potential".


56 Mass Renormalization: This is the only finance example I know where this somewhat advanced path integral technique had to be employed.
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over each infinitesimal time step At ғ 0 is accomplished by convolution with 
the "free propagator" Green function G,, a solution to the underlying diffusion 


equation for infinitesimal At. Multiplication of successive propagators for a 
given path produces the probability density weight for that path. The path integral 
is obtained formally in the limit n — oo. Numerical analysis employs finite n . 


Fig. 2: The semi-group or reproducing kernel property for the free propagator 
corresponding to one-dimensional free diffusion. The significance is that the 
same functional form holds for the propagator between any two times. This 
property is valid in general for the path integral for an arbitrary diffusion process, 
and it holds in any number of spatial dimensions. 
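
The semi-group property described in the caption of Fig. 2 can be verified numerically for the one-dimensional free propagator. The sketch below is illustrative only: it uses a forward-time convention with positive time intervals and constant drift and volatility (a simplification relative to the negative time-step convention of the text), and the function names are hypothetical.

```python
import numpy as np

def free_propagator(x_from, x_to, T, mu, sigma):
    """Gaussian free-diffusion propagator over a positive time interval T."""
    var = sigma**2 * T
    return np.exp(-(x_to - x_from - mu * T) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def convolve_two_steps(x0, x2, T1, T2, mu, sigma, n=4001, width=10.0):
    """Integrate over the intermediate point x1 of two successive propagators."""
    center = x0 + mu * T1
    half = width * sigma * np.sqrt(T1 + T2)
    x1 = np.linspace(center - half, center + half, n)
    f = free_propagator(x0, x1, T1, mu, sigma) * free_propagator(x1, x2, T2, mu, sigma)
    dx = x1[1] - x1[0]
    return float(dx * (f.sum() - 0.5 * (f[0] + f[-1])))   # trapezoid rule

two_step = convolve_two_steps(0.0, 0.1, 0.3, 0.7, mu=0.03, sigma=0.2)
one_step = float(free_propagator(0.0, 0.1, 1.0, mu=0.03, sigma=0.2))
```

Here `two_step` and `one_step` agree to high accuracy: convolving the propagators for the intervals 0.3 and 0.7 reproduces the same functional form for the combined interval 1.0.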


Fig. 3: Illustration of the semigroup proof for the path integral denoted here by 
Gp in one dimension. 


Fig. 4a illustrates the geometry for a call option schedule. The European sector at 
each strike date (where the option is not exercised) is separated from the 
American sector at that strike date (where the option is exercised) by the point on 
the discrete Atlantic path at that strike date. The strike or intrinsic prices are also 
shown assuming lognormal diffusion, although the construction itself is more 
general. 


Fig. 4b illustrates the same for a Bermuda put option schedule. The American 
option is reached formally in the limit as the number of exercise strike dates 
becomes infinite. In that limit the Atlantic path becomes continuous. 


Fig. 5: The path classes for pricing a Bermuda call option with a schedule containing L exercise or strike dates. The paths are only generically illustrated; they must lie below the call-Atlantic path points $\hat{x}_l^{call}$ at the exercise dates $t_l^*$ on the schedule, but between exercise dates, they are unconstrained. If pricing a callable bond, the points $\hat{x}_l^{call}$ expressed in bond price space would decrease because the bond price at maturity is par.


Fig. 6: The path classes for pricing a put option with a schedule. The paths must 
lie above the put-Atlantic path points at the exercise strike dates on the schedule, 
but between exercise dates, they are unconstrained. 


Fig. 7: Illustration of the path integral in two dimensions. See Ch. 45 for the 
mathematical formalism of the path integral in multiple dimensions. 
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43. Path Integrals and Options II: Interest Rates 
(Tech. Index 8/10) 


Summary of this Chapter 


We present the general path-integral framework for one-factor (short-term interest rate), term-structure-constrained models¹. These include Gaussian, mean-reverting Gaussian (MRG), arbitrary rate-dependent volatilities, and memory
effects. It is shown how the stochastic equations for rate dynamics are built 
directly into the path integral. Analytic results are derived by evaluating standard 
calculus integrals. No previous knowledge of either the models or path integrals 
is assumed in this chapter. Those familiar with models may regard this chapter as 
continuing the pedagogical introduction to path integrals. Those familiar with 
path integrals will benefit by this straightforward presentation of the models. 

This chapter is based on my 1989 paper. My derivation of the Mean-Reverting Gaussian (MRG) analytic model reported in this chapter was done independently and roughly concurrently with other authors²,³. The path integral method for me was merely a tool used to derive solutions to models.


1 Acknowledgements: I thank A. Davidson, T. Graham, G. Herman, F. Jamshidian, R.
Jarnagin, A. Litke, A. Nairay, H. Stein, and X. Yang for helpful discussions. I thank 
Bloomberg L.P. for some support in 1989. Finally, I thank Ben Forest for his wizardry in 
retrieving the files of my CNRS papers from my old Mac. 


2 History - My MRG Calculations: Some of my MRG calculations (before the MRG model in the text was developed) are reported in my Merrill working paper Path Integral Continuous Gaussian Bond Option Model (10/86, unpublished). Interest rates fluctuated with mean reversion around a classical rate path, determined numerically. Bond drifts $\mu_{bond}$ and bond volatilities $\sigma_{bond}$, including coupons, were calculated. A Green function to calculate options was constructed directly in bond-price space, using bond drifts and volatilities. An active numerical program existed at Merrill for this hybrid model.

The analytic solution for the term-structure constraints of the MRG model in the text, 
including convexity corrections, is in my Merrill working paper A Path Integral Gaussian 
Bond Option Model, Discounting, and Put-Call Parity (4/22/87, unpublished, Eqn. 3.7). 

My own independent calculations of the MRG model in the text, building on my 
work above, were finished by early 1989. I wrote my work up in my 1989 CNRS working 
paper (ref.), which forms the basis of this chapter. 


3 History - The MRG Model in the Text, the Hull-White Model, Jamshidian's Work: Hull and White are given precedence for the MRG model in the text, now known as the Hull-White Model. Jamshidian, then a member of Merrill's Quantitative Analysis Group
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Introduction to this Chapter 


The valuation of fixed-income derivative products, many of which we have 
discussed in previous chapters, is needed for pricing and risk management. A 
basic issue is to incorporate information about other financial instruments, 
notably zero-coupon bond prices (term-structure constraints), into the models. 
This chapter presents some results of a research program that was carried out in 
1986-1989. The approach used (Feynman path integrals) is a general method for 
solving options problems relevant for practical financial applications. 

In the previous chapter, path-integral techniques were introduced. As a pedagogical introduction, solutions to some standard options models were presented, along with some extensions. This chapter is devoted to a discussion of interest-rate options. We discuss some one-factor short-term interest-rate models, emphasizing consistency with initial, or what here are called "static", term-structure constraints. Some references are at the end of this chapter and Ch. 42.

The simplest special case that has no mean reversion is the constant-volatility continuous Gaussian limit of the binomial model of Ho and Lee.

The next simplest case is the mean-reverting Gaussian (MRG). The mean reversion occurs in fluctuations around a classical path of rates, containing a volatility-dependent term⁴.

Some general results for arbitrary rate-dependent volatilities are also given. A simple case that is not Gaussian is the standard lognormal model.

The most general Gaussian model has a structure that includes mean reversion as a special case, and it can incorporate memory effects. Memory means that the interest rate at one time is correlated to the rate at another time in a fashion that can be specified through parameters in the model.


Other Term-Structure Interest-Rate Models 


Many interest-rate term-structure models exist (see refs.). This book cannot
attempt to deal with a catalog of models, which would require a large volume. 
We only make a short list with few remarks, and give references for the 
interested reader. A number of excellent textbooks deal with these other models. 
Our purpose here is to discuss one model in detail, with my calculations. 


• The single-factor BDT (Black-Derman-Toy) model not only incorporates term-structure rate constraints, but also takes a significant further step to fit bond-yield volatilities through the term structure of the short rate volatility⁵.


that I managed, independently derived the same model. My independent calculations 
were finished somewhat after Jamshidian’s. 


4 Why Use the Name Classical Path? It is natural to call the path around which rate fluctuations occur the classical rate path because in physics the state around which stochastic fluctuations occur generally corresponds to the classical limit. In finance, this classical path is related to the forward interest rates, up to a convexity correction.
• The CIR (Cox-Ingersoll-Ross) model contains a square-root of the short-term rate in the volatility. The CIR model has an analytic modified Bessel-function solution for the Green function. See the footnote⁶ for the connection with path integrals.

• Heath, Jarrow, and Morton (HJM) generalized to multiple factors, including analytic solution of the term-structure constraints in Gaussian forward-rate models⁷.

• "Market models" were constructed by Brace, Gatarek, and Musiela. See also the work of Jamshidian, and Andersen and Andreasen.

• Duffie and collaborators constructed a general class of "affine models".

• Flesaker and Hughston specified processes for discount bonds, with guaranteed positive interest rates.

• Hughston constructed a general differential geometry framework.


I. Path Integrals: Review 


For those readers starting the book with this chapter, we repeat a quick review of 
path integrals as applied to finance. As was stated in Ch. 41, path integration has 
many attractive features. First, the picturesque idea of future interest-rate paths is 
an explicit feature of the approach. Second, the fundamental idea of a contingent 
claim being equal to the expectation (with respect to the appropriate probability 
density function) of the discounted terminal value consistent with boundary 
constraints, is used throughout. Third, a complicated mathematical formalism is 
avoided in practice: one only has to perform multiple Gaussian integrations for 
those problems that have closed-form solutions. Sophisticated measure theory 
issues are incorporated in a straightforward manner using distributions (Dirac 


5 Describing the Dynamical Statistics of the Yield Curve: The Macro-Micro model
(Ch. 47-51), which as a multifactor model tackles the even more complex description of 
the volatilities of yield-curve shapes, has a philosophy similar to the BDT model’s 
recognition of the importance of describing yield dynamical statistics, in addition to the 
usual term-structure constraints. 


6 CIR Model and Path Integrals: The solution of the CIR model can be found in a
straightforward manner using path integrals by temporarily introducing artificial extra 
"angular dimensions" for the interest rate, and using existing path-integral results in polar 
co-ordinates (see Ref. x). It is much harder to work directly without these extra 
dimensions since a variety of technical difficulties appear. However I derived results 
without extra dimensions in 1993-94, starting with the Gaussian measure in the CIR 
model and after direct calculations ending with a small dt approximation to the Bessel 
Green function. In order to get nonleading terms in the asymptotic expansion of the 
Bessel function and also avoid singular behavior at r → 0, a "counter term" is needed. It
would take us too far afield to give the details. 


7 Path Integrals in Multiple Dimensions: Ch. 45 considers dynamics in multiple 
dimensions in the language of path integrals. 
delta functions), thus connecting the stochastic calculus with path integration.
Fourth, the no-arbitrage conditions are implemented simply by specifying drift 
parameters in the path integral through external constraints. Consistency is 
obtained with the standard no-arbitrage hedging recipes. Fifth, the connections of 
the path integral approach with binomial (or multinomial) discretizations, grid 
discretizations, and Monte-Carlo simulation methods are straightforward. 
Moreover, discretized path integration can produce efficient numerical 
techniques. Finally, path integration is easy to generalize to several dimensions to
treat multi-factor models.

We emphasize again that path integrals form a general methodology that is 
consistent with and explicitly uses stochastic equations. The connection of path 
integrals with stochastic equations is direct. This demonstration has been given in 
several previous chapters, and it is repeated here for interest rates in App. A.


The Rest of this Chapter 


The rest of this chapter is organized as follows. Section II contains a discussion 
of the Green function and static term-structure constraints for discrete-time 
Gaussian models with a time-dependent volatility. Section III exhibits the 
continuous-time limit of the constant-volatility Gaussian model, which is the 
continuous limit of the Ho-Lee model. Section IV treats mean-reverting Gaussian 
models with static term-structure constraints. Results for embedded options and 
caps are given. Section V treats the most general Gaussian model, containing all 
others as special cases, and including the capability of dealing with memory. 
Section VI has a summary and outlook. Appendix A treats details of the mean- 
reverting Gaussian model along with the relation of the path integral and 
stochastic equation formalisms. The inclusion of coupons is also given. Appendix 
B treats general volatility models. Appendix C contains a description of the 
general Gaussian model with memory. 


II. The Green Function; Discretized Gaussian Models 


We begin with the Green function $G(r_a, r_b; t_a, t_b)$ for propagation back from
time $t_b$, where the short-term interest rate is called $r_b$, to an earlier time $t_a$ with
short-term interest rate $r_a$, and we set $T_{ab} = t_b - t_a > 0$. As described in Ref. 1
and in Appendix A, the time interval is discretized into intervals of length
$dt = -\Delta t > 0$, thus defining times $\{ t_j \}$. The short-term interest rate at each $t_j$
is called $r_j = r(t_j)$. The formalism in this section will contain a time-dependent
volatility $\sigma_j = \sigma(t_j)$ and drift $\mu_j = \mu(t_j)$. We assume that the
interest-rate process is normal, or Gaussian.


Chapter 43: Path Integrals and Options II: Interest Rates 633 


Appendix B deals with the case of a general rate-dependent “local” volatility
function⁸, $\sigma[r(t), t]$.

As shown explicitly in Appendix A, the general expression for the Green 
function is just the convolution of products of Gaussian conditional probability 
density functions times discount factors, one for each interval. The result is the 
conditional probability density function with the appropriate discounting 
necessary to obtain the expected discounted value for contingent claims⁹,


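$$G(r_a, r_b; t_a, t_b) = \prod_{j=a+1}^{b-1} \int_{-\infty}^{+\infty} dr_j \; \prod_{j=a}^{b-1} \left[ 2\pi \sigma_j^2 \, dt \right]^{-1/2} \exp\left\{ r_j \, \Delta t + \frac{\left( r_j - r_{j+1} - \mu_j \, \Delta t \right)^2}{2 \sigma_j^2 \, \Delta t} \right\} \tag{43.1}$$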


The reader will note that the exponent is a quadratic form in each of the
integration variables $r_j$. In order to evaluate the expression for $G$, we merely
complete the square in each $r_j$ variable successively and then perform the
corresponding integration. This is done by repeatedly using the following identity


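$$\exp\left\{ k x \, \Delta t + \frac{\left( x - x' - \mu \, \Delta t \right)^2}{2 \sigma^2 \, \Delta t} \right\} = \exp\left[ k x' \, \Delta t + k (\Delta t)^2 \left( \mu - \tfrac{1}{2} k \sigma^2 \Delta t \right) \right] \exp\left\{ \frac{\left( x - x' - \tilde{\mu} \, \Delta t \right)^2}{2 \sigma^2 \, \Delta t} \right\} \tag{43.2}$$

Here, $\tilde{\mu} = \mu - k \sigma^2 \Delta t$.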
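The completing-the-square identity can be checked numerically. The following is a minimal sketch, assuming the identity has the form $\exp\{kx\Delta t + (x - x' - \mu\Delta t)^2/(2\sigma^2\Delta t)\}$ on the left with $\Delta t < 0$ and the shifted drift $\tilde\mu = \mu - k\sigma^2\Delta t$ on the right; the function names `identity_lhs` and `identity_rhs` are illustrative and not from the text.

```python
import math


def identity_lhs(x, x_prime, k, mu, sigma, Dt):
    """Left side: discount-like factor exp(k x Dt) times a Gaussian in x (Dt < 0)."""
    gauss_exponent = (x - x_prime - mu * Dt) ** 2 / (2.0 * sigma ** 2 * Dt)
    return math.exp(k * x * Dt + gauss_exponent)


def identity_rhs(x, x_prime, k, mu, sigma, Dt):
    """Right side: factor depending only on x' times a Gaussian with shifted drift."""
    mu_tilde = mu - k * sigma ** 2 * Dt  # assumed form of the shifted drift
    prefactor = math.exp(
        k * x_prime * Dt + k * Dt ** 2 * (mu - 0.5 * k * sigma ** 2 * Dt)
    )
    gauss_exponent = (x - x_prime - mu_tilde * Dt) ** 2 / (2.0 * sigma ** 2 * Dt)
    return prefactor * math.exp(gauss_exponent)


def max_relative_gap(xs, x_prime, k, mu, sigma, Dt):
    """Largest relative difference between the two sides over sample points xs."""
    gaps = []
    for x in xs:
        left = identity_lhs(x, x_prime, k, mu, sigma, Dt)
        right = identity_rhs(x, x_prime, k, mu, sigma, Dt)
        gaps.append(abs(left - right) / left)
    return max(gaps)
```

With $k = 1$ the factor $\exp(k x \Delta t)$ plays the role of the one-period discount factor, so each use of the identity moves the discounting from $x$ to $x'$, which is what allows the Gaussian integrations to be done one at a time.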
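Carrying out the algebra produces the desired result,

$$G(r_a, r_b; t_a, t_b) = \left[ 2\pi \sigma_{ab}^2 T_{ab} \right]^{-1/2} \exp\left[ C_{ab} - r_b T_{ab} \right] \exp\left\{ -\frac{\left( r_a - r_b - \mu_{ab} T_{ab} \right)^2}{2 \sigma_{ab}^2 T_{ab}} \right\} \tag{43.3}$$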


8 Local rate vol: This was an early use of local volatility for interest rates. 


9 Notation for Labels and Indices in Path Integrals: please README! As explained
in Ch. 42, the extension of a dummy index labeled j in the first product is intended to 
extend only locally in the formula only up to the second product with a separate dummy 
index which can be labeled by the same letter j. 
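Here, the quantities $\sigma_{ab}^2$, $\mu_{ab}$, and $C_{ab}$ are defined by

$$\sigma_{ab}^2 T_{ab} = -\sum_{j=a}^{b-1} \sigma_j^2 \, \Delta t > 0 \tag{43.4}$$

$$\mu_{ab} T_{ab} = \sum_{j=a}^{b-1} \left[ \mu_j \, \Delta t - (j+1) \sigma_j^2 (\Delta t)^2 \right] - a \, \sigma_{ab}^2 T_{ab} \, \Delta t \tag{43.5}$$

$$C_{ab} = \sum_{j=a}^{b-1} (j-a+1) \, \Delta t \left[ \mu_j \, \Delta t - \tfrac{1}{2} (j-a+1) \sigma_j^2 (\Delta t)^2 \right] \tag{43.6}$$

Quantities like $\sigma_{0a}$, $T_{0a}$ which will appear are defined by taking $t_a \to \hat{t}_0$ (today)
and $t_b \to t_a$ in Eqns. (43.4), (43.5), and (43.6).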

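We now evaluate the expectation of the discounted terminal value for a zero
coupon bond $P^{(t_b)}(r_a, t_a)$ at time $t_a$, where again the short-term interest rate is
denoted as $r_a$. The zero matures at date $t_b$, where its value
$P^{(t_b)}(r_b, t_b) = 1$, independent of the short-term rate $r_b$. The result is

$$P^{(t_b)}(r_a, t_a) = \int_{-\infty}^{+\infty} dr_b \, G(r_a, r_b; t_a, t_b) \, P^{(t_b)}(r_b, t_b) = \exp\left\{ C_{ab} - \left[ r_a - \mu_{ab} T_{ab} \right] T_{ab} + \tfrac{1}{2} \sigma_{ab}^2 T_{ab}^3 \right\} \tag{43.7}$$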


Now the only unknowns are the drifts $\mu_j$ once the volatilities $\sigma_j$ are given.
In order to obtain them, we shall employ the static term-structure constraints. The
word "static" is there because (as we shall see in a moment) there are exactly
enough drifts in order to constrain the theory to produce the zero-coupon bond
prices exactly at one given time that we can call "today", $\hat{t}_0$. The caret is there in
order to indicate that this is a special time¹⁰. The interest rate at $\hat{t}_0$ will be called
$\hat{r}_0$. Note that once this procedure of drift-determination is carried out for given
volatilities, the theory is specified since there are no more parameters. Zero-coupon
bonds at all times $t > \hat{t}_0$ are then given by the above formula.
The static term-structure constraints are obtained as illustrated in Fig. 1 at the
end of this chapter. The maturity $T$-axis is partitioned into intervals of the same
length $dt = t_{j+1} - t_j = -\Delta t$ as the length of the intervals into which the future


10 Notation: In other parts of this book, “today” is just denoted as tọ without the caret. 
The caret corresponds to the notation of the original paper. 
time axis is partitioned. Denote the logarithm of today's zero-coupon bond price
$\ln P^{(t_j)}(\hat{r}_0, \hat{t}_0)$ by $L_j$, i.e.

$$L_j = \ln P^{(t_j)}(\hat{r}_0, \hat{t}_0) \tag{43.8}$$


In the first time interval $(\hat{t}_0, t_1)$, $\hat{r}_0$ is the short-term interest rate and $\mu_0$ is
the drift. The shortest-term bond maturing at date $t_1$ is specified by $\hat{r}_0$ as
$L_1 = \hat{r}_0 \, \Delta t$. The drift $\mu_0$ over the first time interval is actually determined from
the second bond $P^{(t_2)}(\hat{r}_0, \hat{t}_0)$ maturing at date $t_2$ by
$L_2 = -\mu_0 (\Delta t)^2 + 2 \hat{r}_0 \, \Delta t - \tfrac{1}{2} \sigma_0^2 (\Delta t)^3$. The drift $\mu_1$ over the second time
interval cancels out in this expression due to the second bond maturity condition.
Instead, $\mu_1$ is determined from the third bond $P^{(t_3)}(\hat{r}_0, \hat{t}_0)$ which involves
both $\mu_1$ and $\mu_0$, the latter already being determined. Proceeding successively in
this way, the general solution for the drift $\mu_j$ is obtained as

$$\mu_j = -\frac{1}{(\Delta t)^2} \left[ L_{j+2} - 2 L_{j+1} + L_j \right] - \sum_{i=0}^{j-1} \sigma_i^2 \, \Delta t - \tfrac{1}{2} \sigma_j^2 \, \Delta t \tag{43.9}$$


Note that the first term is the negative of the second finite-difference
numerical derivative of $L^{(t)}(\hat{r}_0, \hat{t}_0) = \ln P^{(t)}(\hat{r}_0, \hat{t}_0)$ with respect to
maturity $t$, evaluated at $t = t_j$. Formally, we let the time interval shrink to zero
to obtain the complete drift function $\mu(t)$.

With the drifts determined, the general result for the zero-coupon bond at
time $t_a$ in the discretized theory is

$$P^{(t_b)}(r_a, t_a) = \exp\left\{ -r_a T_{ab} + (L_b - L_a) + T_{ab} \left[ \frac{L_{a+1} - L_a}{\Delta t} - \tfrac{1}{2} \sigma_{0a}^2 T_{0a} \left( T_{ab} + \Delta t \right) \right] \right\} \tag{43.10}$$
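The drift-determination procedure can be sketched in a few lines. This is a minimal illustration, assuming the drift solution has the second-difference form $\mu_j = -[L_{j+2} - 2L_{j+1} + L_j]/(\Delta t)^2 - \sum_{i<j} \sigma_i^2 \Delta t - \tfrac{1}{2}\sigma_j^2 \Delta t$ with $\Delta t = -dt$, and that discounting over each interval uses the rate at the start of the interval. The names `calibrate_drifts` and `log_bond_today` are hypothetical. The second function recomputes today's log bond prices from the Gaussian expectation, so the fit to the input curve can be verified.

```python
def calibrate_drifts(L, sigma, dt):
    """Drifts mu_j from today's log discount factors L[j] = ln P(maturity t_j).

    L[0] must be 0 (a bond maturing today is worth 1). sigma[j] is the
    volatility over step j. Returns mu_0 .. mu_{n-2} for n = len(L) - 1 steps.
    """
    Dt = -dt  # the text's convention: Delta t < 0
    drifts = []
    cumulative_var = 0.0  # running sum of sigma_i^2 for i < j
    for j in range(len(L) - 2):
        second_diff = L[j + 2] - 2.0 * L[j + 1] + L[j]
        mu_j = -second_diff / Dt ** 2 - cumulative_var * Dt - 0.5 * sigma[j] ** 2 * Dt
        drifts.append(mu_j)
        cumulative_var += sigma[j] ** 2
    return drifts


def log_bond_today(b, r0, mu, sigma, dt):
    """ln of today's price of the zero maturing at t_b in the discretized model.

    Uses r_{j+1} = r_j + mu_j dt + sigma_j sqrt(dt) * (unit normal) and the
    Gaussian expectation of exp(-sum_{j<b} r_j dt).
    """
    total = -r0 * b * dt
    for i in range(b - 1):
        weight = b - 1 - i  # number of later discount periods affected by step i
        total += -weight * mu[i] * dt ** 2 + 0.5 * weight ** 2 * sigma[i] ** 2 * dt ** 3
    return total
```

A usage pattern is to set `r0 = -L[1] / dt` (the shortest bond fixes today's short rate), call `calibrate_drifts`, and confirm that `log_bond_today(b, ...)` returns `L[b]` for every maturity. Under the stated assumptions the repricing is exact, not just accurate to leading order in `dt`.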


Note that only the volatilities at times between $\hat{t}_0$ and $t_a$ occur in $\sigma_{0a}$.
The Green function $G(r_a, r_b; t_a, t_b)$ can be written in terms of the zero-coupon
bond $P^{(t_b)}(r_a, t_a)$. The result is

$$G(r_a, r_b; t_a, t_b) = \frac{P^{(t_b)}(r_a, t_a)}{\left[ 2\pi \sigma_{ab}^2 T_{ab} \right]^{1/2}} \exp\left\{ -\frac{\left( r_a - r_b - \mu'_{ab} T_{ab} \right)^2}{2 \sigma_{ab}^2 T_{ab}} \right\} \tag{43.11}$$

Here, $\mu'_{ab} T_{ab} = \mu_{ab} T_{ab} + \sigma_{ab}^2 T_{ab}^2$, and $\sigma_{ab}$ is given above.

Given the Green function $G(r_a, r_b; t_a, t_b)$, a European contingent claim
$C$ can be obtained by an expectation integration of the Green function times the
terminal intrinsic value. Many examples have been treated earlier in the book,
and some further examples are considered in later sections of this chapter.


III. The Continuous-Time Gaussian Limit 


In this section, we take the continuous limit of the constant-volatility special case
of the model developed in Section II. The result is the continuous limit of the
Ho-Lee binomial model that (with an appropriate constraint among their parameters
$\pi, \delta, \Delta t$) is Gaussian.

The continuous limit is straightforward to carry out in the path integral
formalism. One merely lets $\Delta t$ approach zero and then drops all terms that go to
zero in that limit. Nothing is singular. The limit of the logarithm of the bond price
at $\hat{t}_0$ is related to the integral of the forward interest rate $f(t')$, determined at
the initial time, for $t'$ between $\hat{t}_0$ and $t$, namely

$$L^{(t)}(\hat{r}_0, \hat{t}_0) = \ln P^{(t)}(\hat{r}_0, \hat{t}_0) = -\int_{\hat{t}_0}^{t} f(t') \, dt' \tag{43.12}$$
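Referring to the first figure in Ch. 7, the notation here is that $f(t)$ denotes the
forward rate for maturity $T = t$, determined at $\hat{t}_0$. The forward rate here is
"instantaneous", meaning $\Delta T = 0$. The forward rate at $\hat{t}_0$ is just $\hat{r}_0$. The drift
function $\mu(t)$ is

$$\mu(t) = -\frac{\partial^2}{\partial t^2} \left[ \ln P^{(t)}(\hat{r}_0, \hat{t}_0) \right] + \int_{\hat{t}_0}^{t} \sigma^2(t') \, dt' = \frac{\partial f(t)}{\partial t} + \int_{\hat{t}_0}^{t} \sigma^2(t') \, dt' \tag{43.13}$$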

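If the volatility is assumed constant, this becomes simply

$$\mu(t) = \frac{\partial f(t)}{\partial t} + \sigma^2 \left( t - \hat{t}_0 \right) \tag{43.14}$$

The Green function becomes

$$G(r_a, r_b; t_a, t_b) = \frac{P^{(t_b)}(r_a, t_a)}{\left[ 2\pi \sigma^2 T_{ab} \right]^{1/2}} \exp\left( -\frac{\chi_{ab}^2}{2 \sigma^2 T_{ab}} \right) \tag{43.15}$$

Here defining $T_{a0} = t_a - \hat{t}_0$, and writing $f_a = f(t_a)$, $f_b = f(t_b)$,

$$\chi_{ab} = (r_b - f_b) - (r_a - f_a) - \sigma^2 T_{a0} T_{ab} \tag{43.16}$$

The zero-coupon bond price in the continuous limit for constant volatility is

$$P^{(t_b)}(r_a, t_a) = \exp\left[ -\int_{t_a}^{t_b} f(t') \, dt' - T_{ab} \left( r_a - f_a \right) - \frac{\sigma^2}{2} T_{a0} T_{ab}^2 \right] \tag{43.17}$$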


Here, the continuous-limit integral of the forward rates is just the difference of
the logarithms of initial bond prices,

$$-\int_{t_a}^{t_b} f(t') \, dt' = L_b - L_a \tag{43.18}$$


European Bond Option 


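As an example, a European call option at time $\hat{t}_0$ written on a zero-coupon bond
maturing at time $T$, with exercise date $t^*$ and strike $E$, is given by

$$C(\hat{r}_0, \hat{t}_0) = \int_{-\infty}^{+\infty} dr^* \, G(\hat{r}_0, r^*; \hat{t}_0, t^*) \left[ P^{(T)}(r^*, t^*) - E \right]_+ \tag{43.19}$$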


Here as usual $[P - E]_+$ is defined as $P - E$ if that is positive (in the
money), and otherwise is zero. Since the integral has a Gaussian integrand with
finite limits, the result is expressible in terms of the usual normal integral; the
answer is
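$$C(\hat{r}_0, \hat{t}_0) = P^{(T)}(\hat{r}_0, \hat{t}_0) \, N[\Phi_1] - E \, P^{(t^*)}(\hat{r}_0, \hat{t}_0) \, N[\Phi_2] \tag{43.20}$$

Here, defining $T_* = t^* - \hat{t}_0$ and $T_B = T - t^*$,

$$\Xi_\pm = \ln\left[ \frac{P^{(T)}(\hat{r}_0, \hat{t}_0)}{E \, P^{(t^*)}(\hat{r}_0, \hat{t}_0)} \right] \pm \tfrac{1}{2} \sigma^2 T_B^2 T_* , \qquad \Phi_{1,2} = \frac{\Xi_\pm}{\sigma T_B \sqrt{T_*}} \tag{43.21}$$

Here, the subscripts 1, 2 on $\Phi$ correspond to the +, - subscripts on $\Xi$.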
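As a cross-check of the closed form against the defining integral, here is a minimal sketch for the constant-volatility Gaussian model. It assumes a flat initial forward curve `f` with today at time 0, so that $P^{(t)}(\hat r_0, \hat t_0) = e^{-f t}$, the Green function is $e^{-f t^*}$ times a normal density in $r^*$ with mean `f` and variance $\sigma^2 T_*$, and the bond at $t^*$ is the constant-volatility zero-coupon formula. All function names are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bond_price(r, t, T, f, sigma):
    """Zero maturing at T, seen at time t with short rate r (flat curve f, today = 0)."""
    T_ab = T - t
    return math.exp(-f * T_ab - T_ab * (r - f) - 0.5 * sigma ** 2 * t * T_ab ** 2)


def call_closed_form(E, t_star, T, f, sigma):
    """Closed-form European call on the zero, in terms of the normal integral."""
    P_T = math.exp(-f * T)
    P_star = math.exp(-f * t_star)
    vol = sigma * (T - t_star) * math.sqrt(t_star)  # sigma * T_B * sqrt(T_*)
    log_moneyness = math.log(P_T / (E * P_star))
    phi_1 = (log_moneyness + 0.5 * vol ** 2) / vol
    phi_2 = (log_moneyness - 0.5 * vol ** 2) / vol
    return P_T * norm_cdf(phi_1) - E * P_star * norm_cdf(phi_2)


def call_by_integration(E, t_star, T, f, sigma, n=20001, width=10.0):
    """Trapezoid-rule integration of Green function times intrinsic value over r*."""
    var = sigma ** 2 * t_star
    sd = math.sqrt(var)
    P_star = math.exp(-f * t_star)
    lower = f - width * sd
    h = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        r = lower + i * h
        green = P_star * math.exp(-(r - f) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        payoff = max(bond_price(r, t_star, T, f, sigma) - E, 0.0)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * green * payoff * h
    return total
```

The two functions should agree to within the quadrature error. The integration route is the one that generalizes to payoffs without a closed form.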


Embedded Call Options with Schedules 


Embedded call options have multiple calls. A multiple call schedule is handled as 
described in the previous chapter on path integrals and options. Classes of paths 
are defined, each class corresponding to the possibility of calling the bond at each 
corresponding call date. The critical or “Atlantic path” has to be determined. 
These details are further examined below and in Ch. 44 on numerical applications 
of path integrals. 

An American option is the continuous limit of a call schedule between the
times when the option can be exercised. For example, a two-call schedule on a
zero-coupon bond with exercise times $t_1^*$ and $t_2^*$ has option value
$C = C^{(1)} + C^{(2)}$. Here $C^{(1)}$ corresponds to the possibility of calling the bond
at time $t_1^*$ and $C^{(2)}$ corresponds to the possibility of calling the bond at time $t_2^*$.
Explicitly,


(rs 3j 
C (5,5) = 
f, +оо 
ж * ^ * ^ * * ж * * T * * 
| dg (ОС а аи om id ve) У), 


—oo n 


(43.22) 
^ ñ A 
QOO ye | ае ih Gae уен |: (4525) 


Here, $\hat{r}_1^*$ and $\hat{r}_2^*$ are the points of the discrete Atlantic path. The point $\hat{r}_2^*$ is given
by the intrinsic value being zero at the second strike or exercise date:

$$P^{(T)}(\hat{r}_2^*, t_2^*) = E_2 \tag{43.24}$$

The equality of the first intrinsic value and the option to exercise at the
second strike date, all evaluated at the first strike or exercise date, gives $\hat{r}_1^*$,

$$\int_{-\infty}^{\hat{r}_2^*} dr_2^* \, G(\hat{r}_1^*, r_2^*; t_1^*, t_2^*) \left[ P^{(T)}(r_2^*, t_2^*) - E_2 \right] = P^{(T)}(\hat{r}_1^*, t_1^*) - E_1 \tag{43.25}$$


So far, the formalism has been for zero-coupon bonds. Adding coupons is
straightforward provided the coupons do not depend on the interest rate. One
simply takes the appropriate linear sum of zero-coupon bonds weighted by the
coupons to obtain the forward bonds $P^{(T)}(r^*, t^*)$ for European options;
$P^{(T)}(r_1^*, t_1^*)$ and $P^{(T)}(r_2^*, t_2^*)$ for the two-option problem. The coupons then
appear in the formulas for the Atlantic-path points. Appendix A gives the details.


IV. Mean-Reverting Gaussian (MRG) Models 


In this section we show how mean reversion [vii] can be incorporated into static
term-structure option models. A review of the mean reversion formalism is given
in Appendix A. The mean reversion function $\omega(t)$ is discretized by
$\omega_j = \omega(t_j)$. Actually, the case of most practical interest would be the simplest
example of constant $\omega$. The limit $\omega \to 0$ gives back the Gaussian Ho-Lee
model described in the last section. We shall derive the mean-reverting model
directly; the comments at the end of Appendix A can also be used to derive it.

As usual, the main task is to derive the Green function. This is simplified by
defining at time $t_j$ the variable $x_j = x(t_j)$ describing fluctuations of
$r_j = r(t_j)$ around the classical interest-rate path $r_j^{(Cl)} = r^{(Cl)}(t_j)$ at time $t_j$
as introduced in Ref. [vi],

$$x_j = r_j - r_j^{(Cl)} \tag{43.26}$$

Here, $x(\hat{t}_0) = 0$. The discretized classical path at time $t_j$ is given by
$$r_j^{(Cl)} = \hat{r}_0 - \sum_{j'=0}^{j-1} \mu_{j'} \, \Delta t \tag{43.27}$$

(with no sum at $j = 0$). In the continuous limit this becomes

$$r^{(Cl)}(t) = \hat{r}_0 + \int_{\hat{t}_0}^{t} \mu(t') \, dt' \tag{43.28}$$


The Drift Function

The drift function $\mu(t)$ is determined using the procedure discussed in previous
sections. The result for constant mean reversion and volatility is

$$\mu(t) = \frac{\partial f(t)}{\partial t} + \frac{\sigma^2}{\omega} \, e^{-\omega (t - \hat{t}_0)} \left[ 1 - e^{-\omega (t - \hat{t}_0)} \right] \tag{43.29}$$

The Classical Path of Forward Rates with a Correction Term

The classical path at time $t$ is the forward rate at maturity $t$ determined at time
$\hat{t}_0$ with a volatility-dependent correction term¹¹,

$$r^{(Cl)}(t) = f(t) + \frac{\sigma^2}{2 \omega^2} \left[ 1 - e^{-\omega (t - \hat{t}_0)} \right]^2 \tag{43.30}$$

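A Useful Formula

We have the following useful formula

$$\int_{t_a}^{t_b} r^{(Cl)}(t) \, dt = \int_{t_a}^{t_b} f(t) \, dt + \frac{\sigma^2}{2} \left[ g(\omega T_{b0}) - g(\omega T_{a0}) \right] \tag{43.31}$$

Here $T_{j0} = t_j - \hat{t}_0$ and

$$g(\omega \tau) = \frac{1}{\omega^3} \left[ \omega \tau + 2 \left( e^{-\omega \tau} - 1 \right) - \tfrac{1}{2} \left( e^{-2 \omega \tau} - 1 \right) \right] \tag{43.32}$$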
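The useful formula is easy to verify numerically. The sketch below assumes the correction term of the classical path is $(\sigma^2 / 2\omega^2)[1 - e^{-\omega \tau}]^2$ with $\tau = t - \hat t_0$, and that $g$ is normalized with $1/\omega^3$ so that the integrated correction equals $(\sigma^2/2)[g(\omega T_{b0}) - g(\omega T_{a0})]$. The function names are illustrative only.

```python
import math


def g(omega, tau):
    """Assumed form: g = [w + 2(exp(-w) - 1) - (exp(-2w) - 1)/2] / omega^3, w = omega*tau."""
    w = omega * tau
    return (w + 2.0 * (math.exp(-w) - 1.0) - 0.5 * (math.exp(-2.0 * w) - 1.0)) / omega ** 3


def classical_path_correction(omega, sigma, tau):
    """Volatility-dependent term added to the forward rate in the classical path."""
    return sigma ** 2 / (2.0 * omega ** 2) * (1.0 - math.exp(-omega * tau)) ** 2


def integrated_correction_numeric(omega, sigma, tau_a, tau_b, n=20000):
    """Midpoint-rule integral of the correction term from tau_a to tau_b."""
    h = (tau_b - tau_a) / n
    total = 0.0
    for i in range(n):
        total += classical_path_correction(omega, sigma, tau_a + (i + 0.5) * h)
    return total * h


def integrated_correction_closed(omega, sigma, tau_a, tau_b):
    """Closed form of the same integral via g."""
    return 0.5 * sigma ** 2 * (g(omega, tau_b) - g(omega, tau_a))
```

For small $\omega \tau$ the function behaves like $g \approx \tau^3 / 3$, which recovers the constant-volatility Gaussian limit; in that regime a series expansion would be numerically safer than the closed form because of cancellation.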


l! Futures vs. Forwards Convexity Term: This is the origin of the convexity correction 
for this mean-reverting Gaussian model. The expected short rate <r(t)> = r^ (t) is the 
future rate; the convexity correction is defined in Eq. (7.2). 
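The Zero-Coupon Bond

The zero-coupon bond for constant mean reversion and volatility is given by

$$P^{(t_b)}(r_a, t_a) = \exp\left\{ -\int_{t_a}^{t_b} f(t) \, dt - \frac{x_a}{\omega} \left[ 1 - e^{-\omega T_{ab}} \right] - \frac{\sigma^2}{2} \left[ g(\omega T_{b0}) - g(\omega T_{a0}) - g(\omega T_{ab}) \right] \right\} \tag{43.33}$$

Equivalently we can write

$$P^{(t_b)}(r_a, t_a) = \exp\left[ -\frac{x_a}{\omega} \left( 1 - e^{-\omega T_{ab}} \right) \right] P^{(t_b)}\left( r_a^{(Cl)}, t_a \right) \tag{43.34}$$

Here,

$$P^{(t_b)}\left( r_a^{(Cl)}, t_a \right) = \exp\left\{ -\int_{t_a}^{t_b} r^{(Cl)}(t) \, dt + \frac{\sigma^2}{2} \, g(\omega T_{ab}) \right\} \tag{43.35}$$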


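The Green Function from the Path Integral

The result for the Green function $G(x_a, x_b; t_a, t_b)$ is given by evaluating the
path integral. For the discretized case it is given by (again $\Delta t < 0$):

$$G(x_a, x_b; t_a, t_b) = \prod_{j=a+1}^{b-1} \int_{-\infty}^{+\infty} dx_j \; \prod_{j=a}^{b-1} \left[ 2\pi \sigma_j^2 \, dt \right]^{-1/2} \exp\left\{ r_j \, \Delta t + \frac{\left[ x_{j+1} - (1 + \omega_j \Delta t) \, x_j \right]^2}{2 \sigma_j^2 \, \Delta t} \right\} \tag{43.36}$$

Note that to first order in $\Delta t$ we have $(1 + \omega_j \Delta t) \approx e^{\omega_j \Delta t}$. From a path
integral perspective, this is the origin of the exponential factor of the
mean-reversion described in Appendix A.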
Evaluation of the Green Function

For simplicity in the following discussion, we use constant mean reversion and
constant volatility. The general case can also be treated using similar techniques.
The Green function can be evaluated in at least two ways. The first is, as
before, to use a recursive method. We just complete the square using

$$\exp\left\{ k x \, \Delta t + \frac{\left[ \xi x - x' - \mu \, \Delta t \right]^2}{2 \sigma^2 \, \Delta t} \right\} = \exp\left[ \frac{k}{\xi} x' \, \Delta t + \frac{k}{\xi} (\Delta t)^2 \left( \mu - \frac{k}{2 \xi} \sigma^2 \Delta t \right) \right] \exp\left\{ \frac{\left[ \xi x - x' - \tilde{\mu} \, \Delta t \right]^2}{2 \sigma^2 \, \Delta t} \right\} \tag{43.37}$$

Here we have defined $\tilde{\mu} = \mu - \dfrac{k}{\xi} \sigma^2 \Delta t$.


Simple Harmonic Oscillators and the MRG Model 


Another method was described by Feynman (ref). The problem faced here is
formally the same as that of forced harmonic oscillator motion with the
replacement $\omega \to \sqrt{-1} \, \omega$. Feynman's formulae can then be used along with
the addition of "surface terms" needed for the correct normalization¹². The
surface terms are the cross terms in the expansion of the action density
$\left[ \dfrac{dx}{dt} + \omega x \right]^2$.


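MRG Green Function Result

The result of carrying out either procedure for the Green function is

$$G(x_a, x_b; t_a, t_b) = \frac{P^{(t_b)}(r_a, t_a)}{\left[ 2\pi \sigma_{ab}^2 T_{ab} \right]^{1/2}} \exp\left( -\frac{\chi_{ab}^2}{2 \sigma_{ab}^2 T_{ab}} \right) \tag{43.38}$$

Here,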


? Surface Terms: Watch out. Surface terms are usually dropped as being irrelevant, but 
they are needed here. 
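$$\chi_{ab} = x_b - e^{-\omega T_{ab}} x_a + \frac{\sigma^2}{2 \omega^2} \left[ 1 - e^{-\omega T_{ab}} \right]^2 \tag{43.39}$$

$$\sigma_{ab}^2 T_{ab} = \frac{\sigma^2}{2 \omega} \left( 1 - e^{-2 \omega T_{ab}} \right) \tag{43.40}$$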


IV Continued: Numeraires 


A Green function by convention can be separated into the product of two parts, 
and this separation is not unique. Recall that in the proof of Girsanov’s theorem 
in Appendix | of Ch. 42, the key was to write the overall drift arbitrarily as the 


sum of two parts $\mu = \mu_0 + \mu_1$. The backward Gaussian Green function $G^{(\mu)}$
with drift $\mu$ over the small negative time interval $\Delta t = t - t' < 0$ factorizes using
ordinary algebra as (cf. Eq. 42.69):


GU x, titt) = exp (eco ar) HE GO) (хз) (43.41) 


The reciprocal of the first factor in Eq. (43.41) is called the “numeraire”. The
Green function defined by the second factor $G^{(\mu_0)}$ has the same Gaussian form
as $G^{(\mu)}$ but with the drift $\mu_0$ replacing $\mu$.


The Green function multiplied by an ordinary differential dx (not a time 
change) gives the transition probability to start at (x',t') and arrive at (x,t ) 


within a bin of size dx around the point x , and therefore is a measure. 

Note also that the Green function “propagates” information and can be called 
a “propagator”, as was done in Ch. 42. 

In Ch. 42 we discussed dynamics where the interest rate for discounting was
different from the stochastic variable. For interest rate dynamics, things are more
complicated because the discount factor and the measure or transition
probability part of the Green function both depend on the interest rate.
Considering Eq. (43.36), we call the product of the infinitesimal discount
factors the reciprocal of the “money market numeraire”, viz¹³

$$\left[ N^{(ab)}_{Money\ Market} \right]^{-1} = \exp\left( \sum_{j=a}^{b-1} r_j \, \Delta t \right) , \qquad \Delta t < 0 \tag{43.42}$$


13 I thank Xipei Yang for helpful discussions on numeraires and many other topics.
The Forward Numeraire and Measure


After some algebra as described above, we obtained the zero-coupon bond as an
explicit factor in the Green function Eq. (43.38). Following convention we call
this factor the inverse of the “forward numeraire”, viz

$$\left[ N^{(ab)}_{Forward} \right]^{-1} = P^{(t_b)}(r_a, t_a) \tag{43.43}$$

The Gaussian in Eq. (43.38) multiplied by the ordinary calculus differential
$dx$ is called the “forward measure”¹⁴.

Note that there is a corresponding drift in the Gaussian, contained in $\chi_{ab}$ in Eq. (43.39).


Another Expression for the MRG Green Function 


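Suppose we introduce a later arbitrary time $t_c$ with $t_c > t_b > t_a$. Using the
zero-coupon bond formulae and Eq. (43.38), after some algebra the MRG Green
function can be written as

$$G(x_a, x_b; t_a, t_b) = \frac{P^{(t_c)}(r_a, t_a)}{P^{(t_c)}(r_b, t_b)} \, \frac{1}{\left[ 2\pi \sigma_{ab}^2 T_{ab} \right]^{1/2}} \exp\left( -\frac{\left[ \chi_{ab}^{(c)} \right]^2}{2 \sigma_{ab}^2 T_{ab}} \right) \tag{43.44}$$

where

$$\chi_{ab}^{(c)} = x_b - e^{-\omega T_{ab}} x_a + \Delta_{ab}^{(c)} \tag{43.45}$$

$$\Delta_{ab}^{(c)} = \frac{\sigma^2}{2 \omega^2} \left\{ \left[ 1 - e^{-\omega T_{ab}} \right]^2 + \left[ 1 - e^{-2 \omega T_{ab}} \right] \left[ 1 - e^{-\omega T_{bc}} \right] \right\} \tag{43.46}$$

The drift in $\chi_{ab}^{(c)}$ picks up an extra term relative to the drift in $\chi_{ab}$. The reciprocal
of the corresponding numeraire is the factor multiplying the Gaussian, viz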


" History: I derived the forward measure that I believe was contemporaneous with other 
authors; I wrote it up later in my 1989 CNRS preprint (ref 1). 
$$\left[ N^{(ab)}_{(t_c)} \right]^{-1} = \frac{P^{(t_c)}(r_a, t_a)}{P^{(t_c)}(r_b, t_b)} \tag{43.47}$$

Note that although the Green function is independent of the arbitrary time $t_c$,
both the numeraire and the drift in the Gaussian on the right hand side of Eq.
(43.44) depend on $t_c$; this dependence cancels. Therefore it might seem that this
has been a useless exercise, but it is important because the arbitrary nature of the
choice of numeraire can impact calculations of a regulatory requirement, credit
PFE (Potential Future Exposure) in a risk-neutral setting where numeraires play
an important role, as we consider next.


Stein's Observation: PFE, Monte Carlo Simulations, and Numeraires 


Recall again that two stochastic equations, identical except for different drifts,
define two Green functions proportional up to a factor, the numeraire. If the two
stochastic equations #1 and #2 are used to generate paths in two Monte Carlo
simulators MC#1 and MC#2 that use the same random numbers and volatilities,
and start at the same initial point $(\hat{r}_0, \hat{t}_0)$, the paths from these two simulators will
arrive with different probability densities at a given future point $(r_b, t_b)$ within a
bin $dr_b$ (ordinary calculus differential). The probabilities are different because
they do not include the numeraires and the drifts are different. Because the path
densities are different, the points corresponding to a given confidence level in
rates, say CL = 95%, will not be the same in the two MC simulators, viz

$$\left. r_b^{(CL = 95\%)} \right|_{\text{Stochastic Eqn. \#1}} \neq \left. r_b^{(CL = 95\%)} \right|_{\text{Stochastic Eqn. \#2}} \tag{43.48}$$


For example, suppose we generate 100 paths in each simulator using the
same random numbers and the same volatilities. The 5th highest rate points at the
future time $t_b$ will be different in the two simulators. A security calculated at
these two different points for the two simulators will also differ. Therefore the
risk at the 95% CL, e.g. the PFE, will be different for the two simulators.

For this reason any calculation that relies on a future confidence level will be 
ambiguous because of the arbitrary choice of how the numeraire is defined. 
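The 100-path example can be imitated with a toy simulator. This is only a sketch under simplifying assumptions: a constant-drift Gaussian short rate with Euler steps, two drift values standing in for two numeraire choices, and the same seed so both simulators use the same random numbers. The function names and parameter values are hypothetical and carry no calibration.

```python
import random


def simulate_terminal_rates(r0, mu, sigma, horizon, n_steps, n_paths, seed):
    """Terminal short rates from Euler paths dr = mu dt + sigma dW, with a fixed seed."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    terminal = []
    for _ in range(n_paths):
        r = r0
        for _ in range(n_steps):
            r += mu * dt + sigma * dt ** 0.5 * rng.gauss(0.0, 1.0)
        terminal.append(r)
    return terminal


def kth_highest(values, k):
    """The k-th highest value, e.g. k = 5 of 100 paths for a 95% confidence level."""
    return sorted(values, reverse=True)[k - 1]


def confidence_level_gap(r0, mu_1, mu_2, sigma, horizon, n_steps, n_paths, seed, k):
    """Difference between the k-th highest terminal rates of the two simulators."""
    rates_1 = simulate_terminal_rates(r0, mu_1, sigma, horizon, n_steps, n_paths, seed)
    rates_2 = simulate_terminal_rates(r0, mu_2, sigma, horizon, n_steps, n_paths, seed)
    return kth_highest(rates_2, k) - kth_highest(rates_1, k)
```

In this toy case every path is displaced by `(mu_2 - mu_1) * horizon`, so the confidence-level point moves by exactly that amount. Any exposure measure evaluated at that point therefore depends on which drift was simulated.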

As one example, paths generated using the measure as the second factor in
Eq. (43.44) depend on the arbitrary (and physically irrelevant) time $t_c$, as we
just saw explicitly above.
Using arguments like these, Stein showed that credit PFE (Potential Future
Exposure) is ill defined. Indeed Stein showed that any pre-specified value of PFE
can be obtained with a corresponding numeraire in a risk-neutral setting. PFEs
are substantially numerically different even for different standard
numeraires¹⁵.

Stein's result is similar in spirit to the differences in FX Monte Carlo 
simulations using the two different drifts corresponding to the two different FX 
definition conventions (see Ch. 5, “The Two Country Paradox’). 


Smart Monte Carlo Preview 


A numeraire can be chosen for a MC simulation using a drift that is designed to 
sample some region of space in the future with high probability, tailored to the 
payoff of some instrument. This is a type of importance sampling, and is a special 
case of "Smart Monte Carlo". 


Notation: The Green function with variable x and short rate r 


Inserting $x(t) = r(t) - r^{(Cl)}(t)$ into the expression for the Green function
$G(x_a, x_b; t_a, t_b)$ re-expresses the results in terms of $r(t)$. We call the result
$G(r_a, r_b; t_a, t_b)$, keeping the same name $G$ for simplicity.


Notation: Connection with Hull-White (HW) 


Hull and White (HW) use the following notation, assuming constant mean
reversion and volatility:

$$dr(t) = \left[ \theta(t) - a \, r(t) \right] dt + \sigma \, dW(t) \tag{43.49}$$

The connection of HW’s mean reversion $a$ and volatility $\sigma$ with our notation is
$a = \omega$, and HW’s $\sigma$ is our $\sigma$. With arbitrary $v(t)$ we also have


dr (t) 


ш: 


t or? (t)-ov(t) (43.50) 
a-sort- e 7 cer (43.51) 


dW (t)=n(t)dt+v(t)dt (43.52) 


15 H. Stein, private communication (2013).
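European Bond Option in the MRG Model


Armed with our result for the Green function we are in a position to calculate
some contingent claims. The European call option is defined as usual by

$$C(\hat{r}_0, \hat{t}_0) = \int_{-\infty}^{+\infty} dr^* \, G(\hat{r}_0, r^*; \hat{t}_0, t^*) \left[ P^{(T)}(r^*, t^*) - E \right]_+ \tag{43.53}$$

$C$ is given by the same form as before,

$$C(\hat{r}_0, \hat{t}_0) = P^{(T)}(\hat{r}_0, \hat{t}_0) \, N[\Phi_1] - E \, P^{(t^*)}(\hat{r}_0, \hat{t}_0) \, N[\Phi_2] \tag{43.54}$$

The mean-reverting forms for the arguments of the normal integral are

$$\Phi_{1,2} = \frac{\omega \, \Xi_\pm}{\sigma_{0*} \sqrt{T_*} \left[ 1 - \exp(-\omega T_B) \right]} \tag{43.55}$$

Here, the subscripts 1, 2 on $\Phi$ correspond to the $\pm$ subscripts on $\Xi$, and

$$\Xi_\pm = \ln\left[ \frac{P^{(T)}(\hat{r}_0, \hat{t}_0)}{E \, P^{(t^*)}(\hat{r}_0, \hat{t}_0)} \right] \pm \frac{\sigma_{0*}^2 T_*}{2 \omega^2} \left[ 1 - \exp(-\omega T_B) \right]^2 \tag{43.56}$$

Finally, $\sigma_{0*}$ is $\sigma_{ab}$ with $t_a, t_b$ replaced by $\hat{t}_0, t^*$. As before, $T_* = t^* - \hat{t}_0$
and $T_B = T - t^*$.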


Caplet in the MRG Model 


As another example, consider a caplet with threshold strike rate $E$. The caplet is
a single option, whose value at the strike date $t^*$ is defined by the difference of
the interest rate and $E$, provided that is positive. Thus, in the MRG model,

$$Caplet(\hat{r}_0, \hat{t}_0) = \int_{-\infty}^{+\infty} dr^* \, G(\hat{r}_0, r^*; \hat{t}_0, t^*) \left[ r^* - E \right]_+ = P^{(t^*)}(\hat{r}_0, \hat{t}_0) \left\{ \sigma_{0*} \sqrt{\frac{T_*}{2\pi}} \exp\left[ -\frac{(f^* - E)^2}{2 \sigma_{0*}^2 T_*} \right] + (f^* - E) \, N\left[ \frac{f^* - E}{\sigma_{0*} \sqrt{T_*}} \right] \right\} \tag{43.57}$$

Here, $f^* = f(t^*)$ is the forward rate at $t^*$ and the other quantities are as
above. The actual caplet as quoted in the market will be this result multiplied by a
time interval between resets $Dt_{reset}$. In addition, other details involve the cap paid
“in arrears” at time $Dt_{reset}$ later than $t^*$. A market cap is actually a sum of caplets
with different reset dates. There is also a notional principal amount. Caps are
usually priced with lognormal rate assumptions, but sometimes they are priced
with Gaussian models and sometimes with a mix of lognormal and Gaussian.
Practical information and more details on caps can be found in Ch. 10. Finally,
please see “Gaussian into Lognormal Using a Simple Trick” in Chapter 20.


Negative Rates in Gaussian Models 


For Gaussian models, rates can become negative. Since this is generally 
unphysical, Gaussian models were criticized. Recently some rates went negative, 
and Gaussian rate models made a comeback. The probability of negative rates 
depends on the time from the start, the volatility, and the starting rate. 

For example, the value of a floor with a strike of zero with positive rates
should be exactly zero¹⁶. This is because the floor only pays off if the rate
becomes lower than the strike. On the other hand, floors with zero strikes are not 
without value in Gaussian rate models due to negative rates. 
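To make the dependence on horizon, volatility, and starting level concrete, here is a minimal sketch. It assumes the rate at the horizon is normal with a user-supplied mean (for instance the classical path value) and a standard deviation of the mean-reverting Gaussian form $\sigma \sqrt{(1 - e^{-2\omega T}) / (2\omega)}$. The floorlet function returns an undiscounted expected payoff per unit notional and accrual, so it illustrates the effect and is not a market price. All names are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gaussian_rate_std(sigma, omega, horizon):
    """Standard deviation of the rate at the horizon; omega = 0 gives sigma*sqrt(T)."""
    if omega == 0.0:
        return sigma * math.sqrt(horizon)
    return sigma * math.sqrt((1.0 - math.exp(-2.0 * omega * horizon)) / (2.0 * omega))


def prob_negative_rate(mean, std):
    """Probability that a normal rate with the given mean and std is below zero."""
    return norm_cdf(-mean / std)


def zero_strike_floorlet_expected_payoff(mean, std):
    """Undiscounted E[max(0 - r, 0)] for r normal with the given mean and std."""
    d = -mean / std
    return -mean * norm_cdf(d) + std * norm_pdf(d)
```

For example, a mean of 3% with $\sigma = 1\%$ per year, $\omega = 0.1$, and a 5-year horizon gives a standard deviation of about 1.8% and a small but nonzero probability of negative rates, so the zero-strike floorlet has positive value in the Gaussian model.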

The extent to which negative rates influence the pricing of other securities is 
not easy to extract. 


V. The Most General Model with Memory 


In this section, we present the most general one-factor model possible. A
Gaussian version Ref. [vi] was originally proposed, and we discuss this first. All
other Gaussian models discussed in this chapter are special cases. In this general
model, the exponent in the integrand for the Green function contains correlations
with separate (zero or non-zero) coefficients between the interest rates at any two
times, with arbitrary weights¹⁷. The discretized Green function can be written as

$$G(x_a, x_b; t_a, t_b) = \Omega_{ab} \prod_{j=a+1}^{b-1} \int_{-\infty}^{+\infty} dx_j \, \exp\left\{ \sum_{j=a}^{b-1} r_j \, \Delta t - \sum_{k,m=a}^{b} x_k A_{km} x_m (\Delta t)^2 \right\} \tag{43.58}$$


16 Zero-Strike Floor: The example of the zero-strike floor was first made by C. Rogers.


17 Constraint: The matrix (Axm) needs to be symmetric positive-definite. 
Here, $A_{km}$ is a number for given indices $k, m$. Also, $\Omega_{ab}$ is the normalization
factor, given by

$$1 = \Omega_{ab} \prod_{j=a+1}^{b} \int_{-\infty}^{+\infty} dx_j \, \exp\left[ -\sum_{k,m=a}^{b} x_k A_{km} x_m (\Delta t)^2 \right] \tag{43.59}$$

With this definition, the Green function still represents the conditional
probability density function multiplied by the discount factors needed to produce
expected discounted values for contingent claims. The factor $(\Delta t)^2$ is written for
convenience; the matrix $A$ is then dimensionless. As before, $x(t)$ is the
difference between $r(t)$ and the classical path $r^{(Cl)}(t)$.


Applications of Models with Memory 


Some attempt was made in 1986-1987 to determine parameters in this Gaussian 
model to fit term-structure constraints and other market data including both the 
matrix A and the classical path. Because the number of parameters was not 
sufficiently restricted, the general form of the model proved to be difficult to 
implement in practice. 

We feel the wide range of market phenomena potentially describable may
well justify further effort. In particular, an "effective" memory can be created by
stochastic variables that are not normally included in options pricing (see Ref.
[iii], and the remarks at the end of this section). Notice that the matrix element
$A_{km}$ indeed connects the rates $r_k$ and $r_m$ at times $t_k$ and $t_m$. This incorporates
memory effects of the interest rate process with itself.


Special Cases 


The special cases treated in previous sections are given explicitly by specific 
choices of the matrix A . For the Gaussian model, the connection is 


Aceon: (At)? = + б, мл ==2 б,» | (43.60) 


тоз! $, sau 


Here, Ó. is the Kronecker delta, equal to one if r = s and zero otherwise. Thus, 


rs 
only nearest neighbor times are connected by the matrix A. For the constant 
mean-reverting Gaussian (MRG) model, the result is 


2 
A (до)? = + Ó, m- -2 Ó, m Je (43.61) 


cum 


csl аи 
This contains the extra term at $k = m$.

Contingent claims can now be evaluated by the usual procedure of taking
expectation values. Again, for those problems reducible to iterated Gaussian
integrals, closed form solutions can be obtained. We exhibit the calculation of a
zero-coupon bond in Appendix C.

As mentioned, $A_{km}$ was assumed a constant in Ref. [vi]. However, a priori
$A_{km}$ can be a function of time and rates at various times,

$$A_{km} = A_{km}\left( \{ r_j, t_j \} \right) \tag{43.62}$$


This provides the most general one-factor model, including memory. Special 
cases include the general-volatility models in Appendix B. 


Connection of the Path Integral with the Stochastic Equations 

We now give a description of the connection to the stochastic Langevin-Ito
equation, using the techniques described in Appendix A. To motivate the
discussion, write the finite-difference stochastic equation for the Gaussian model,

$$\left[ \frac{dx(t)}{dt} \right]_j = \frac{x_{j+1} - x_j}{dt} = \sigma_j \eta_j \tag{43.63}$$

This produces (with $dt = -\Delta t$ again),

$$\eta_j = -\frac{1}{\sigma_j \, \Delta t} \left[ \delta_{k, j+1} - \delta_{k, j} \right] x_k \equiv \sqrt{-2 \Delta t} \; C_{jk}^{Gaussian} x_k \tag{43.64}$$

This defines the matrix $C_{jk}^{Gaussian}$. In Eqn. (43.64) we use the summation
convention for the repeated index $k$.
Now we could simply write, for a general matrix $C$, the equation

$$\eta_j = \sqrt{-2 \Delta t} \; C_{jk} x_k \tag{43.65}$$

Defining the matrix $A = C^T C$ as the transpose of $C$ times $C$ we find that
the probability density function for the $\eta_j$ variable becomes (with the summation
convention for indices $k, m$)

$$\exp\left[ \frac{\eta_j^2 \, \Delta t}{2} \right] = \exp\left[ -C_{jk} x_k C_{jm} x_m (\Delta t)^2 \right] = \exp\left[ -x_k A_{km} x_m (\Delta t)^2 \right] \tag{43.66}$$

This is exactly what is required to produce the general Gaussian model.
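The relation between the stochastic equation and the quadratic form can be illustrated on a short path. The sketch below assumes the nearest-neighbor Gaussian case, with $C_{jk}$ read off from $\eta_j = -(x_{j+1} - x_j)/(\sigma_j \Delta t) = \sqrt{-2\Delta t}\, C_{jk} x_k$ and $A = C^T C$. It uses plain Python lists, and the helper names are hypothetical.

```python
import math


def gaussian_C(sigmas, Dt):
    """Matrix C for the Gaussian model; row j couples x_j and x_{j+1} (Dt < 0)."""
    n_steps = len(sigmas)
    scale = math.sqrt(-2.0 * Dt)
    C = [[0.0] * (n_steps + 1) for _ in range(n_steps)]
    for j, s in enumerate(sigmas):
        coeff = -1.0 / (s * Dt * scale)
        C[j][j + 1] = coeff   # multiplies x_{j+1}
        C[j][j] = -coeff      # multiplies x_j
    return C


def transpose_times(C):
    """A = C^T C as a list of lists."""
    rows, cols = len(C), len(C[0])
    return [
        [sum(C[j][k] * C[j][m] for j in range(rows)) for m in range(cols)]
        for k in range(cols)
    ]


def quadratic_exponent(A, x, Dt):
    """Exponent -x_k A_km x_m (Dt)^2 of the general Gaussian model."""
    n = len(x)
    return -sum(x[k] * A[k][m] * x[m] for k in range(n) for m in range(n)) * Dt ** 2


def noise_exponent(sigmas, x, Dt):
    """Exponent sum_j eta_j^2 Dt / 2 computed directly from the path increments."""
    dt = -Dt
    total = 0.0
    for j, s in enumerate(sigmas):
        eta = (x[j + 1] - x[j]) / (s * dt)
        total += eta ** 2 * Dt / 2.0
    return total
```

The resulting $A$ is symmetric and tridiagonal, which is the statement that only nearest-neighbor times are connected in the Gaussian model. A general $C$ with entries further from the diagonal produces the memory terms.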


Stochastic Equations with Memory 


Note that if we write out the general Langevin-Ito equation explicitly (this time
without the summation convention) we obtain¹⁸

$$\frac{\eta_j}{\sqrt{2 \, dt}} = C_{jj} x_j + \left[ C_{j,j+1} x_{j+1} + C_{j,j-1} x_{j-1} \right] + \sum_{|k - j| \geq 2} C_{jk} x_k \tag{43.67}$$

The first term contains mean reversion while the "nearest-neighbor" terms in
the bracket are in the Gaussian model Eqn. (43.64). The sum contains potential
memory terms, not in the previous models, and corresponding to effects of the
noise at time $t_j$ on the variables $x(t)$ at other times.


Physical Intuition for Memory Effects 


Physically it is easy to see how memory can occur. Imagine a particle diffusing in 
a medium that is kicked at some time by an "invisible gremlin". The future 
trajectory of the particle will remember this kick at all later times; this is the 
memory effect. In the case of the currency option with stochastic interest rates, 
for example, the kicks are due to fluctuations in the interest rate processes and the 
variable left in the description is the exchange rate. 

It is easy to invent a simple 2-factor model¹⁹ to exhibit how this effective
memory is generated. We start with two random variables, $x$ and $q$, satisfying

$$\frac{dx(t)}{dt} = A\,x(t) + B\,q(t) + \sigma\,\eta(t) \qquad (43.68)$$

$$\frac{dq(t)}{dt} = C\,q(t) + E\,x(t) + \alpha\,\xi(t) \qquad (43.69)$$


¹⁸ ARIMA Models: The formalism is related to ARIMA models. See Ch. 52.


¹⁹ Acknowledgements: This example was applied long ago in different contexts by
Lindenberg and С. West. It is a special case of the "heat-bath", clearly exposed in 
Feynman and Hibbs (cf. Ref. i11, Pp. 68 ff.) 
Here, $\eta(t)$, $\xi(t)$ are two correlated Gaussian random variables. In this
two-dimensional formalism where both $x$ and $q$ are explicit, there are no memory
effects. Now, however, integrate out $q(t)$ by using the solution to Eqn. (43.69)
with initial condition $q_0 = q(0)$,

$$q(t) = e^{Ct}\,q_0 + \int_0^t dt'\,e^{C(t-t')}\left[E\,x(t') + \alpha\,\xi(t')\right] \qquad (43.70)$$

Plugging that into Eqn. (43.68) we find

$$\frac{dx(t)}{dt} = A\,x(t) + \sigma\,\eta(t) + B\,e^{Ct}\,q_0 + B\int_0^t dt'\,e^{C(t-t')}\left[E\,x(t') + \alpha\,\xi(t')\right] \qquad (43.71)$$

Explicit memory effects are present in Eqn. (43.71) for $x(t)$ with $q(t)$
removed (i.e. the "gremlin" variable $q$ has been made "invisible"). That is, $x$ at
time $t$ depends on $x$ at previous times due to the mutual interaction of the $x$
and $q$ variables in the original equations (43.68), (43.69).
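The mechanism can be checked numerically. The following is a minimal sketch, assuming a simple Euler discretization of the two coupled equations, independent (rather than correlated) Gaussian noises for simplicity, and arbitrary illustrative parameter values. It simulates $x$ and $q$ jointly, then rebuilds $q$ from the history of $x$ and of the $q$-noise alone, using the discrete analogue of the exponential memory kernel. The function names are hypothetical.

```python
import numpy as np

def simulate_two_factor(A, B, C, E, sigma, alpha, x0, q0, dt, n_steps, seed=0):
    """Euler scheme for dx/dt = A x + B q + sigma*eta, dq/dt = C q + E x + alpha*xi.
    Noises have width 1/sqrt(dt); they are independent here (an assumption)."""
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(n_steps) / np.sqrt(dt)
    xi = rng.standard_normal(n_steps) / np.sqrt(dt)
    x = np.empty(n_steps + 1)
    q = np.empty(n_steps + 1)
    x[0], q[0] = x0, q0
    for j in range(n_steps):
        x[j + 1] = x[j] + dt * (A * x[j] + B * q[j] + sigma * eta[j])
        q[j + 1] = q[j] + dt * (C * q[j] + E * x[j] + alpha * xi[j])
    return x, q, xi

def q_from_history(x, xi, C, E, alpha, q0, dt):
    """Discrete analogue of the integrated solution for q(t):
    q_j = g^j q0 + dt * sum_{i<j} g^(j-1-i) [E x_i + alpha xi_i], with g = 1 + C dt."""
    n_steps = len(xi)
    g = 1.0 + C * dt
    q = np.empty(n_steps + 1)
    q[0] = q0
    for j in range(1, n_steps + 1):
        i = np.arange(j)
        kernel = g ** (j - 1 - i)          # discrete version of exp[C (t - t')]
        q[j] = g ** j * q0 + dt * np.sum(kernel * (E * x[i] + alpha * xi[i]))
    return q

params = dict(C=-1.0, E=0.2, alpha=0.1, q0=1.0, dt=0.01)
x, q, xi = simulate_two_factor(A=-0.5, B=0.3, sigma=0.1, x0=0.0, n_steps=200, **params)
q_mem = q_from_history(x, xi, **params)
print("max |q - q_mem| =", np.max(np.abs(q - q_mem)))
```

Since $q$ at each step is fully determined by earlier values of $x$ and $\xi$, substituting `q_mem` into the $x$ update gives an equation for $x$ alone that depends on its own past, which is the memory effect described above.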


VI. Wrap-Up for this Chapter 


This chapter has dealt with some term-structure one-factor models. A variety of 
models was considered, including mean reversion and even memory effects. 

What can be improved in the current generation of one-factor options 
models? First, let us step back and ask what has actually been accomplished. The 
static term-structure constraints have been included. Still, improvement is 
possible. For example, there is no guarantee at all that the term-structure time- 
averaged statistical properties of the yield curve, including its fluctuations in 
shape, will be correctly produced in accord with market observations²⁰.

As a concrete example of the importance of this remark, mortgage-backed 
securities require the correct spread statistics between short and long term rates in 
order to obtain correct prepayment model input. Simple one-factor models do not 
generate realistic spread statistics, and are therefore not optimal. On the other 
hand, for embedded options in corporate bonds that are "weak probes" of the 
actual interest rate process, these models can be quite useful as a parameter- 
dependent characterization of market data at a certain time from which small 


²⁰ Remark: Although this point was made in 1989, it is still true today. See the chapters
on the Macro-Micro model (Ch. 48-51).

perturbations are made close to that time to price bonds in normal trading and
sales activities.

Large effects, e.g. market "crashes", are excluded from almost all models
constructed so far. We have always believed that a description of crashes requires 
non-linear phase-transition physics between multiple equilibria. See Ch. 46. 


Appendix A: MRG Formalism, Stochastic Equations, Etc. 


A.1 Relation of the MRG Path Integral to Stochastic Equations 


We first present the formalism of mean reversion in Gaussian models using the 
path-integral language appropriate to this paper, and we connect it to the 
Langevin-Ito stochastic equation formalism. 

We begin with the Langevin-Ito equation defining the MRG model²¹,

$$\frac{dx(t)}{dt} = -\omega(t)\,x(t) + \sigma(t)\,\eta(t) \qquad (43.72)$$

We write as before $x(t) = r(t) - r^{(CL)}(t)$ to get the fluctuations of the
interest rate about the classical path.
As described in Ch. 42, $\sigma(t)\,\eta(t)$ has the interpretation of the random slope
of the path from time $t$ to time $t + dt$, given $x(t)$ at time $t$, and accounting for
the mean reversion between times $t$ and $t + dt$. The idea is illustrated in Figure
2 at the end of this chapter.
Now $\eta(t)$ is assumed a Gaussian random variable with zero mean and width
$\langle \eta^2(t) \rangle^{1/2} = 1/\sqrt{dt}$, where the expectation value is with respect to the
probability density for $\eta(t)$,

$$\wp[\eta(t)]\,d\eta(t) = \sqrt{\frac{dt}{2\pi}}\,\exp\left[-\tfrac{1}{2}\,\eta^2(t)\,dt\right] d\eta(t) \qquad (43.73)$$

The time interval $(t_a, t_b)$ is discretized into $n+1$ points $t_j$ with
$j = [a, a+1, \ldots, (a+n = b)]$ and interval $dt = t_{j+1} - t_j = -\Delta t > 0$. For
notational convenience, we also define an index $i = j - a = [0, 1, \ldots, n]$. The


²¹ Notation: The left hand side of this equation is $[x(t+dt) - x(t)]/dt$, which is the same as
what we called $d_t x(t)/dt$ in other parts of the book.

variable $x(t)$ is discretized to $x_j$ with the endpoints $x_a = x_{i=0}$ and $x_b = x_{i=n}$
fixed. The total probability conditioned on the endpoint constraints is the product
of the probability densities at times $i = [0, \ldots, n-1]$. However, with the two
endpoints constrained, only variable slopes at $i = [0, \ldots, n-2]$ are integrated
over; the last slope with $i = n-1$ is then specified once $x_{i=n-1}$ and $x_{i=n}$ are
given.
Using this discretization and again recognizing that the endpoints are fixed
we obtain²²


s n-2 +0 
1 P tot {n} Dij = P (hionn) : П 1 p ( 7], )dn, 
—0 i=j-a=0 ~% 
n-2 + dn 
= (maa) un J 2m l-imda| (43.74) 


We multiply this equation by the Dirac delta-function identity

$$1 = \prod_{i=j-a=0}^{n-2} \int_{-\infty}^{+\infty} dx_{i+1}\;\delta\left[x_{i+1} - x_i\left(1 - \omega_i\,dt\right) - \sigma_i\,\eta_i\,dt\right] \qquad (43.75)$$

Because the last point is fixed at $x_b$ we need to fix the last slope as
$\eta_{i=n-1} = \left[x_b - x_{n-1}\left(1 - \omega_{n-1}\,dt\right)\right] / \left(\sigma_{n-1}\,dt\right)$. Now in fact we are not
interested in the $\eta_i$ variables since we want the paths to be specified by the $x_i$
variables. The Dirac $\delta$-functions arrange this by killing the $\eta_i$ integrals and
substituting the expressions for the $\eta_i$ found in the $\delta$-function arguments. We
insert the discount product factor $\prod_{i=j-a=0}^{n-1} e^{-r_i\,dt}$ in the integrals, as is necessary
for the expected discounted expression. We need to use the formula (Ref. [xviii])

$$\delta\left[f(y)\right] = \frac{\delta(y - y_0)}{\left|df/dy\right|} \qquad (43.76)$$


²² Comment: We could proceed by including integrals over all slopes $\eta_j$ up to $j = n-1$,
inserting one more delta function and then take away the last $dx_b$ integral to account for
the fixed point $x_b$. I thank Andrew Kavalov for a clarifying discussion on this point.

Here, $f(y_0) = 0$. Taking the mean reversion as a constant then yields the
formula for the Green function for the mean-reverting Gaussian model, Eqn.
(43.36).

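The zero-coupon bond price $P^{(t_b)}(r_a, t_a)$ with maturity date $t_b$, evaluated
at time $t_a$ with interest rate fixed at one possible value $r_a$, can be obtained by
Gaussian integration using the formula, all with $r_a$ fixed,

$$P^{(t_b)}(r_a, t_a) = \left\langle \exp\left[-\int_{t_a}^{t_b} r(t)\,dt\right] \right\rangle = \exp\left[-\int_{t_a}^{t_b} \langle r(t) \rangle\,dt + \tfrac{1}{2}\int_{t_a}^{t_b} dt \int_{t_a}^{t_b} dt'\,\langle r(t)\,r(t') \rangle_c\right] \qquad (43.77)$$

This generalizes $\langle \exp[-r(t)] \rangle = \exp\left[-\langle r(t) \rangle + \tfrac{1}{2}\langle r^2(t) \rangle_c\right]$ at fixed time.

Here the second-order correlation function is defined as usual,

$$\langle y_1\,y_2 \rangle_c = \langle y_1\,y_2 \rangle - \langle y_1 \rangle \langle y_2 \rangle \qquad (43.78)$$

The expectation values are with respect to the Gaussian measure without the
discount factors. The average $\langle r(t) \rangle = \langle x(t) \rangle + r^{(CL)}(t)$ along with
$\langle r(t)\,r(t') \rangle_c = \langle x(t)\,x(t') \rangle_c$, all with $r_a$ fixed, are evaluated from the solution
of the Langevin-Ito equation for constant mean reversion,

$$\frac{dx(t)}{dt} = -\omega\,x(t) + \sigma(t)\,\eta(t) \qquad (43.79)$$

Given $x_a$ at $t_a$ we have

$$x(t) = x_a\,e^{-\omega(t - t_a)} + \int_{t_a}^{t} e^{-\omega(t - \xi)}\,\sigma(\xi)\,\eta(\xi)\,d\xi \qquad (43.80)$$

Hence

$$\langle x(t) \rangle = x_a\,e^{-\omega(t - t_a)} \qquad (43.81)$$

If further $\sigma(\xi) = \sigma$ is taken constant for simplicity we can use the identities
$\langle \eta(t) \rangle = 0$, $\langle \eta(t_1)\,\eta(t_2) \rangle = \delta(t_1 - t_2)$ to obtain the results

$$\langle x(t)\,x(t') \rangle_c = \frac{\sigma^2}{2\omega}\left[e^{-\omega|t - t'|} - e^{-\omega(t + t' - 2t_a)}\right] \qquad (43.82)$$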


Also, with Z^ given by Eqn. (43.32), we have 


fai dt (x(t) x(t’). = a 3 g (oTp) (43.83) 


t t 
a a 


Some algebra then leads to the results in the text. 
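The connected correlation function can be checked directly. The sketch below assumes constant $\omega$ and $\sigma$ and unit-strength white noise, in which case the correlation is $\sigma^2 \int_{t_a}^{\min(t,t')} e^{-\omega(t-\xi)}\,e^{-\omega(t'-\xi)}\,d\xi$, and compares a trapezoid-rule evaluation of this integral with the closed form $\frac{\sigma^2}{2\omega}\left[e^{-\omega|t-t'|} - e^{-\omega(t+t'-2t_a)}\right]$. Function names and parameter values are illustrative assumptions.

```python
import numpy as np

def cov_closed_form(t, tp, t_a, omega, sigma):
    """Closed-form connected correlation <x(t) x(t')>_c for constant omega, sigma."""
    return sigma**2 / (2.0 * omega) * (
        np.exp(-omega * abs(t - tp)) - np.exp(-omega * (t + tp - 2.0 * t_a))
    )

def cov_quadrature(t, tp, t_a, omega, sigma, n=20001):
    """Trapezoid rule for sigma^2 * int_{t_a}^{min(t,t')} e^{-w(t-s)} e^{-w(t'-s)} ds."""
    s = np.linspace(t_a, min(t, tp), n)
    f = sigma**2 * np.exp(-omega * (t - s)) * np.exp(-omega * (tp - s))
    h = s[1] - s[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))

t_a, omega, sigma = 0.0, 0.5, 0.01
exact = cov_closed_form(1.5, 2.0, t_a, omega, sigma)
numeric = cov_quadrature(1.5, 2.0, t_a, omega, sigma)
print(exact, numeric)
```

Setting $t = t'$ gives the variance of $x(t)$, which tends to $\sigma^2/(2\omega)$ at long times, the familiar stationary variance of a mean-reverting Gaussian process.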
For general time-dependent mean reversion, we define a modified volatility
function $\sigma_\omega(t)$ by

$$\sigma_\omega(t) = \exp\left[\int_{t_0}^{t} \omega(t')\,dt'\right] \sigma(t) \qquad (43.84)$$

Then the variable

$$x_\omega(t) = \exp\left[\int_{t_0}^{t} \omega(t')\,dt'\right] x(t) \qquad (43.85)$$

after a little algebra is seen to satisfy the equation without mean reversion,

$$\frac{dx_\omega(t)}{dt} = \sigma_\omega(t)\,\eta(t) \qquad (43.86)$$

The time-dependent volatility $\sigma_\omega(t)$ can be handled as described in the text.
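The "little algebra" can be spelled out in one line. The sketch below assumes the MRG equation in the form $dx/dt = -\omega(t)\,x + \sigma(t)\,\eta$ and the integrating factor $\exp\left[\int_{t_0}^{t}\omega(t')\,dt'\right]$ used to define $x_\omega$ and $\sigma_\omega$; the lower limit $t_0$ is an arbitrary reference time. Because the integrating factor is deterministic, the ordinary product rule applies.

```latex
\frac{dx_\omega(t)}{dt}
  = \exp\!\Big[\int_{t_0}^{t}\omega(t')\,dt'\Big]
    \Big[\,\omega(t)\,x(t) + \frac{dx(t)}{dt}\,\Big]
  = \exp\!\Big[\int_{t_0}^{t}\omega(t')\,dt'\Big]\,\sigma(t)\,\eta(t)
  = \sigma_\omega(t)\,\eta(t)
```

The mean-reversion term $\omega(t)\,x(t)$ produced by differentiating the exponential cancels the $-\omega(t)\,x(t)$ in $dx/dt$, leaving a driftless equation with the rescaled volatility.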


A.2 The MRG Diffusion Equation 


The diffusion equation solved by the Green function can be obtained by Fourier 
transform methods Ref. [iv]. From Eqn. (43.36), 


G(x, ,%,5t,5 f) = 


2 
—] +00 e E: | = 1 A : 
b-l [m b-l [22020] a r, At LET A о = | (43.87) 
| j 7 j 
We rewrite the exponential at fixed co-ordinates and again set $dt = -\Delta t > 0$.
Introducing the Fourier transform variable $k_j$ at each partition time produces


G(x, ,%3t,.t,)= IT f &, ПЈ p 


In Eqn. (43.88), $i = \sqrt{-1}$. The coefficient of $dt$ in the exponent is the
infinitesimal time generator $\left.\frac{\partial}{\partial t}\right|_{\text{fixed } x_j}$ for a transition backward from time $t_{j+1}$
at position $x_{j+1}$ to time $t_j$ at position $x_j$. We replace $\sqrt{-1}\,k_j$ by $\frac{\partial}{\partial x(t)}$
evaluated at $t_j$ according to Fourier prescription. Dropping the subscript $j$ and
setting $r(t) = x(t) + r^{(CL)}(t)$, the diffusion operator identity at any time $t$ is:


2 e 


à m 
Au 7 Aw 


ôt 


+x +r (т) (43.89) 


fixed x(t) E a(t) x(t) 


This derivation is actually more general. The same equation holds if the
volatility and mean reversion become functions of $x(t)$ and $t$. The ordering of
the spatial derivatives conforms to the backward Kolmogoroff equation; this
point is a little tricky to see. Simplification occurs by using an integrating factor
to remove the classical path in the equation propagating back from time $t_b$ to $t_a$,

$$D^{(CL)}(t_a, t_b) = \exp\left[-\int_{t_a}^{t_b} r^{(CL)}(t)\,dt\right] \qquad (43.90)$$

$D^{(CL)}$ is the discount factor produced by the classical path and is very
intuitive: it is just the discounting produced by the average interest rate path
about which the fluctuations occur. We set

G[x(D ,x,;:1,5] = DO (4.4) P[x().x,;;,5] (4390 


Equivalently, 


LEID ‚х„;ї,„] = Сега] 


ы (43.92) 
Then we have


д 
pudo 2X, 5t sty | feat) 
д 
pe» ( t, , і, ) | ôt fixed x(t) = = ( ) | d | a | K | ' " ү с 


The classical path drops out of the equation for Z”, which is then solved for. 


Since only the integral of r enters in С, the forward-rate dependence is 


eliminated in favor of zero-coupon bond prices. To illustrate, this intermediate 
step of removing the classical path results in the European call option 


CRD? a PISO EET 
pp? Cp aatem. gu (43.94) 


Here, 


q^? Са) = p? (x*;i*) 


izo (43.95) 


Equivalently, defining М = E / D? ( t*,T ). we have 


CCA S) =D (A T) ега m UO (e, e) -N] de 


(43.96) 


We can rewrite the equation in terms of $r(t)$ using $x(t) = r(t) - r^{(CL)}(t)$.
Since the classical path at fixed time is fixed, the partial spatial derivatives satisfy
$\left.\frac{\partial}{\partial x(t)}\right|_{\text{fixed } t} = \left.\frac{\partial}{\partial r(t)}\right|_{\text{fixed } t}$. The partial time derivative picks up an extra
term, however, since fixing $x(t)$ is not the same as fixing $r(t)$. We have

$$\left.\frac{\partial}{\partial t}\right|_{\text{fixed } r(t)} = \left.\frac{\partial}{\partial t}\right|_{\text{fixed } x(t)} - \frac{dr^{(CL)}(t)}{dt}\,\left.\frac{\partial}{\partial x(t)}\right|_{\text{fixed } t} \qquad (43.97)$$

This relation can be seen by applying this identity to a function of $x(t)$ and $t$
expanded as a double power series in $x(t)$ and $t$ while using
$x(t) = r(t) - r^{(CL)}(t)$.


The operator equation expressed in terms of r (t) is then 


5 Јар. ET EN 
EE Б alt) [ 50) – ^q) | 2: 

1-2 e 

lg OF t r(t) (43.98) 


The derivative of the classical path involves differentiating the forward-rate 
function f (t) that in practice is somewhat unstable. Therefore, Eqn. (43.89) can 


be preferable since it involves the classical path itself; the classical discounting 
factor is also useful. 

This completes the discussion of the diffusion equation that is satisfied by the 
Green function. Any European contingent claim also satisfies this equation, since 
it is obtained by convoluting the Green function with its terminal value, 
consistent with the spatial boundary conditions. 


A.3. Inclusion of Coupons for Bond Options in the MRG Model 


To close this appendix, we give the details of how to include coupons for bond 
options in the MRG formalism. The limit of zero mean reversion gives the 
Gaussian model. A European call option on a coupon bond is given by the 
convolution of the Green function with the intrinsic value of the option on the 
forward bond at the exercise date, with strike price Е, 


+оо T 
C= [dr*G(R.r*$.r| У с Р (19) -Е| (43.99) 


t, 2 t* 


In Eqn. (43.99), all coupons c, at times Т, after the exercise date /* to 
maturity date T are included in the forward bond? on which the option is 


²³ Forward Bond: This is the value of the bond at the forward exercise time, dependent
on whatever rates occur at the forward time. The coupons in the forward bond are the 
only coupons included, since the previous coupons have been paid before exercise. The 
idea is a little like the forward stock price that has previous dividend payments removed. 
See Ch. 42. The forward bonds are also used for forward CMT rates. See Ch. 10. 
written. At maturity $T$, we also need to include the par amount. The first
"coupon" after $t^*$ is the actual coupon reduced by a fraction equal to the time to
that coupon payment from $t^*$ divided by the time between coupon payments.


Now call n the interest rate at time £ * where the call intrinsic value is zero: 


T 
Po poer еи (43.100) 


tp >t* 


In general, this equation must be solved numerically for n . For r* € (79.77) : 


the intrinsic value is positive since c, > 0. Integration of Eqn. (43.99) then 


produces the European call option on a coupon bond as 
Ch sto) = 
Г 


> Po TN ue (;)]- EP 6,5) N | o, C7 )| 43401) 


tp > t* 


Setting 7, — 1, —1*, Ty - (* — f, , and f *= f(t*) we have 


(43.102) 


The zero-coupon bonds PONE) in Eqn. (43.101) are given by the term 
structure data at the current time f, as usual. If the coupons are removed, the 
equation (43.100) for n can be solved analytically, and the zero-coupon bond 


European option results of the text are reproduced. 

Bermuda options with call schedules and American options are treated as 
mentioned in the text with coupons included for each forward bond as above. The 
numerical back-chaining algorithm must be employed. 
Appendix B: Rate-Dependent Volatility (Local Vol) Models


In this appendix, we deal with one-factor models with general volatility
functions, but without memory effects. The lognormal interest-rate model is a
special case, as well as the Gaussian models of the text. Combinations of
lognormal and Gaussian models can also be incorporated²⁴. In order to motivate
the ideas, consider the Langevin-Ito equation for a function $y$ of the interest rate
$r(t)$, which we write as²⁵

$$\frac{dy[r(t)]}{dt} = \mu_y(t) + \sigma_y(t)\,\eta(t) \qquad (43.103)$$

We can rewrite Eqn. (43.103) using the formula

$$\frac{dy[r(t)]}{dt} = y'[r(t)]\,\frac{dr(t)}{dt} + \tfrac{1}{2}\,y''[r(t)]\,\sigma^2[r(t), t] \qquad (43.104)$$

Here, $y'$ and $y''$ are the first and second derivatives of $y$ with respect to $r$,
evaluated at $r(t)$. Now we define

$$\sigma[r(t), t] = \frac{\sigma_y(t)}{y'[r(t)]} \qquad (43.105)$$

$$\mu[r(t), t] = \frac{\mu_y(t) - \tfrac{1}{2}\,y''[r(t)]\,\sigma^2[r(t), t]}{y'[r(t)]} \qquad (43.106)$$

This produces the equivalent Langevin-Ito equation for $r(t)$ as

$$\frac{dr(t)}{dt} = \mu[r(t), t] + \sigma[r(t), t]\,\eta(t) \qquad (43.107)$$

Using this we find an alternative expression for $\mu$,


²⁴ Lognorm Models: Sometimes linear combinations of Gaussian and lognormal
processes have been used. They have the imaginative name “Lognorm” models.


²⁵ Acknowledgements: F. Jamshidian recognized early the importance of this sort of
transformation. I thank him for a helpful conversation on the topic.
$$\mu[r(t), t] = \frac{1}{y'[r(t)]}\left\{\mu_y(t) - \tfrac{1}{2}\,\sigma_y^2(t)\,\frac{y''[r(t)]}{\left(y'[r(t)]\right)^2}\right\} \qquad (43.108)$$

Eqn. (43.107) gives the equivalence between the Langevin-Ito equation
(43.103) for $y[r(t)]$ and that for a general volatility function in the Langevin-Ito
equation for $r(t)$. The $r$-drift is specified once the volatility is given along with
the $y$-drift.
The lognormal model is defined by $y(t) = F[r(t)] = \ln[r(t)\,T_0]$. Here $T_0$
is a time scale (e.g. 1 year for annualized interest rates). We get
$\sigma[r(t), t] = \sigma_y(t)\,r(t)$ and $\mu[r(t), t] = r(t)\left[\mu_y(t) + \tfrac{1}{2}\,\sigma_y^2(t)\right]$, the
usual results. In this special case, $y''/(y')^2 = -1$ is independent of $r$.
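The change of variables can be illustrated with a few lines of code. This is a minimal sketch, assuming the mapping $\sigma = \sigma_y / y'$ and $\mu = \left(\mu_y - \tfrac{1}{2}\,y''\,\sigma^2\right)/y'$ between the $y$-process and the $r$-process; the function name and the numerical values are hypothetical. It checks the lognormal case $y = \ln(r\,T_0)$, for which $y' = 1/r$ and $y'' = -1/r^2$, and the Gaussian case $y = r$.

```python
import math

def r_drift_and_vol(mu_y, sigma_y, y1, y2):
    """Map the y-process parameters to the r-process parameters.
    y1 = y'(r) and y2 = y''(r), both evaluated at the current rate r."""
    sigma_r = sigma_y / y1
    mu_r = (mu_y - 0.5 * y2 * sigma_r**2) / y1
    return mu_r, sigma_r

# Illustrative inputs (assumed values, not taken from the text)
r, mu_y, sigma_y = 0.05, 0.02, 0.20

# Lognormal model: y = ln(r * T0), so y' = 1/r and y'' = -1/r^2
mu_ln, sigma_ln = r_drift_and_vol(mu_y, sigma_y, y1=1.0 / r, y2=-1.0 / r**2)
print("lognormal:", mu_ln, sigma_ln)

# Gaussian model: y = r, so y' = 1 and y'' = 0
mu_g, sigma_g = r_drift_and_vol(mu_y, sigma_y, y1=1.0, y2=0.0)
print("Gaussian: ", mu_g, sigma_g)
```

For the lognormal case the output should agree with $\sigma_y\,r$ and $r\left(\mu_y + \tfrac{1}{2}\sigma_y^2\right)$; for the Gaussian case the parameters pass through unchanged.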


The path integral for the Green function for general volatility expressed in 
terms of the interest rate is given, as explained in Ref. [iv] and in this chapter, by 
the Langevin-Ito equation (43.107) and the Dirac delta-function procedure 
given in Appendix A. The Langevin-Ito equation is a constraint in the double 
integral at each time of the interest rate and the Langevin-Ito variable. The result 
is 


САСА го) = 


2 
b-l uU b-1 dt dr(t) | 
dr, М. ех r, dt 43.109 
IL] Пав + el à ) '? IM 


Here $N_j = \left[2\pi\,\sigma_j^2\,dt\right]^{-1/2}$ is the appropriate normalization factor, and
$\mu_j = \mu[r(t_j), t_j]$, $\sigma_j = \sigma[r(t_j), t_j]$. The limits on the integral will
depend on $y$. For example, a lognormal distribution has positive rates only.


We can also write the expression for the Green function in terms of the 
function y. This can be done in either of two ways: (1) rewriting the path 


integral Eqn. (43.109) through a change of variables, or (2) directly using 
Appendix A, starting with the Langevin-Ito equation for y, Eqn. (43.103). We 


write the discretization $y_j = y[r_j] = y[r(t_j)]$, with $r_j = r(y_j)$. The
lognormal model has $r(y) = \exp(y) / T_0$. The Gaussian model has $r(y) = y$.


The Green Function for General Volatilities as a Path Integral 
We obtain 
Gy itt) =


b-1 Ь-1 dt dy 
dy, |А, - dt "ZU 
I Yj П "n. exp r(y,) 20? (1, ) |2 i H, ( 4 ) 


(43.110) 
In the change of variables in the path integral, the Jacobian enters to preserve 
the unit normalization for the probability density function. The normalization 


factors are N, , = [2л c. (t. Jar = 

This equation is just the convolution of the probability densities expressed in 
terms of y , with the discounting factors also expressed in terms of у. The limits 
on the integrals indicated by R[y] are defined by the lower and upper limits on 
the possible numerical values of y. The function y can be chosen to limit the 
integrations over the interest rate by an infinite derivative y' forcing the 
volatility o, — o [ r(t, ),t; ] to become zero, and thus setting barriers to the 
interest-rate process at the desired limits (7, , 7, ) in Eqn. (43.109). For the 
lognormal process, the lower bound occurs at zero interest rate. 

The static term-structure constraints can be imposed using Eqn. (43.110) by 
choosing the y -drift parameters 44, (t, ) one at a time using the procedure in 


Sect. II. This can be done since ће у -drift depends (by assumption) only on the 


time. This would have to be done numerically as analytic expressions for the 
zero-coupon bond prices do not exist in closed form except for certain cases (e.g. 
Gaussian models). Even for the simple lognormal model, the term-structure 
constraints must be carried out numerically using an iterative technique. 


Appendix C: The General Gaussian Model With Memory 


In this appendix, we exhibit the calculation of a zero-coupon bond in the general
model of Section V potentially including memory effects, and with constant $A_{km}$
matrix elements as introduced in Ref. [vi]. We show two things. First, the
term-structure constraints may be imposed as for the simpler Gaussian models.
Second, the logarithm of the bond price is linear in the interest rate, with
quadratic terms in the interest rate canceling out.

The zero-coupon bond is defined as usual,

$$P^{(t_b)}(r_a, t_a) = \int dr_b\;G(x_a, x_b; t_a, t_b)\,P^{(t_b)}(r_b, t_b) \qquad (43.111)$$

The terminal maturity condition is as usual, $P^{(t_b)}(r_b, t_b) = 1$. The calculation
of the integrals is facilitated by organizing the integrals into those with indices
$k, m = (a+1), \ldots, (b-1)$. We then perform the last integral over $x_b$. We
define a sub-matrix $B$ of the matrix $A$,

$$B_{km} = A_{km} \quad \text{with } [k, m = (a+1), \ldots, (b-1)] \qquad (43.112)$$
We also define the vector б with components 
¢,=1 with [k = (a*D),..,(5b-1)] (43.113) 

We also define vectors Ј, , J, with components 

J, = А At, J, =A, At with [k-(act1)..(b-1)] (43.114) 
Finally, we define a, D , V, , Y, , W, by the relations 
а = А, (At) -J,- B^ -J, 
B=- Ap (ND +J, B^ -J, 
у=: BJ, (43.115) 


V,-6: B^ d, 

y,-6 В 
Here $B^{-1}$ is the matrix inverse of $B$, and the sums on the indices of the vectors
and of $B^{-1}$ from $(a+1)$ to $(b-1)$ are implied by the dots.


After Gaussian integration, the discretized Green function is found to be
(again $\Delta t < 0$ for backward propagation),


G (X, X431) = 


(43.116) 


+оо 


Í dx, exp Е: x, + 2 Вх, х, ] 


—o0 


b- exp | -a x; + (28x, – w,) x 
exp p PO Myx, + wef | | E iJ 


j-a 


Performing the final integration, the zero-coupon bond is 
P? (r,t) =
b-1 2 

exp? Y, ri^? ми ae (43.117) 
pem a 4a 


Hence, the logarithm of the bond price is indeed linear in the co-ordinate
$x_a = r_a - r_a^{(CL)}$. Moreover, the drift factor enters in the canonical fashion
through the classical path; the procedure outlined in Section II for determination
of the drifts therefore holds.

Given the general Gaussian form for the Green function Eqn. (43.116), 
contingent claims like options can be found in a straightforward manner, 
consistent with the static term-structure constraints. 


Figure Captions for This Chapter 


Figure 1: Term-Structure Constraints. Determination of the drift function in the
static term-structure-constrained models. The vertical maturity $T$ axis at the
initial time $t_0$ is partitioned in the same way as in the future-time axis. One by
one, as shown by the arrows, each drift $\mu_j = \mu(t_j)$ between future times
$(t_j, t_{j+1})$ is determined by the bond at the initial time $t_0$ with maturity date
$t_{j+2}$.


Figure 2: Stochastic Equation Variables. This is a graphical illustration of the
connection of the Langevin-Ito variable $\eta_j = \eta(t_j)$ and the co-ordinate
$x_j = x(t_j)$ at time $t_j$ for a given discretization into $n$ time intervals. As
described in the text, this variable (multiplied by the volatility and corrected for
mean reversion) describes the slope of the straight line drawn between $x_j$ and
$x_{j+1}$ in one realization. The random slopes produce random paths between the
fixed endpoints $x_a$ at time $t_a$ and $x_b = x_{a+n}$ at time $t_b = t_{a+n}$.


Figure 3: The Classical Path. A path (or realization of the random process) for
the interest rate $r(t)$ as time varies. The random paths fluctuate about the
classical interest rate path labeled by $r^{(CL)}(t)$. The process between times
$(t_a, t_b)$ is discretized into $n$ points labeled as $r_1, r_2, \ldots, r_n$ at times
$t_1, t_2, \ldots, t_n$. This drawing is taken from Ref. [vi].
44. Path Integrals and Options III: Numerical
(Tech. Index 6/10) 


Summary of this Chapter 


This chapter presents some aspects of numerical methods for options based on 
path-integral techniques. We have already emphasized the connection between 
the binomial algorithm (or any lattice method), Monte-Carlo simulations, and 
path integrals. 

A major topic is the Castresana-Hogan method for discretizing path
integrals¹. Some simplifying approximations are discussed. An iterative
procedure based on "call filtering" for Bermuda options leads to a
"quasi-European" approximation. The idea of “geometric volatility” is introduced. We
also present an approximation to lognormal dynamics using a mean-reverting
Gaussian designed to speed up calculations².


New material for the 2nd Edition³


• “Smart Monte Carlo”
• “American Monte Carlo”, within a path integral context


Introduction to this Chapter 


In previous chapters, the Feynman/Wiener path integral was applied to options in 
a variety of examples. This chapter examines some aspects of numerical 
techniques using path integrals as a base. Castresana and Hogan solved the problem of
making direct path-integral discretization practical. This approach will be discussed
in some detail. All the usual numerical techniques (binomial, multinomial, grid, 
Monte Carlo) are approximations to the path integral. Because standard texts treat 


¹ Castresana-Hogan Method: Juan Castresana implemented the Castresana-Hogan
method before 1990. Juan says that Marge Hogan had the basic idea while working in the
aircraft industry.


² History: This chapter started as a sequel for the path-integral paper series in 1989. My
work here for the book’s 1st edition was performed mostly during 1987-97.


³ Acknowledgement: This is a summary of papers (refs) written while Team Leader of
the Quant Risk Analytics Group at Bloomberg LP, SSRN, 2016 (refs).
these topics in detail’, we restrict ourselves here to some interesting aspects of
numerical methods motivated by the explicit path integral.

One advantage of using the explicit path-integral formalism involves the
elimination of most of the discretization as a grid or lattice in those cases where
the Green function (i.e. propagator) can be computed in closed form. Then the
exact solution can be used over those time intervals where free diffusion occurs. 
Such intervals are not supposed to contain any exercise date or other date when 
cash flows need to be computed. This eliminates convergence problems between 
exercise dates (since the exact propagator is used). Further, since the time 
intervals can be chosen arbitrarily in the path integral, the awkwardness of dates 
that are not exactly at grid or lattice points in the standard construction is 
eliminated. 

In pioneering work, Geske and Johnson derived a special case of path- 
integral methods, for multiple-put options. However, the numerical algorithm 
they chose was slow⁴.


Path Integrals and Common Numerical Methods 


To start the discussion and for motivation, we consider the connection between 
the path integral, the binomial approximation, and Monte Carlo simulation. 


Path Integrals, MC Simulations, and the Binomial Approximation 

Often a binomial numerical recipe is used for evaluating options. Essentially this 
means that the random numbers generating the paths are replaced by fixed 
numbers allowing only stylized “up” and “down” movements. The binomial 
geometry is defined as having one node containing two outgoing legs “up” and 
“down” as time increases. The binomial lattice is supposed to recombine at each 
time $t_j$. The number of points of the binomial lattice in $x_j$ at a given $t_j$ is then
$j+1$, not $2^j$. Smaller bin sizes, i.e. more points in $x_j$ at $t_j$, are therefore
connected with smaller values of $\Delta t$ and thus more time steps.
See the footnote for a story⁵.


⁴ Acknowledgements: R. Geske, private communication.


⁵ Story: Binning the Paths, Risk Talks, and Business Trips: This connection between
Monte Carlo simulation and the binomial approximation using bins is an old idea. The 
picture (next page) is from a talk I used to give when I worked for Eurobrokers in 1990. 
Eurobrokers would rent conference rooms in nice hotels in various cities in Europe. 
The talk was on risk management for interest-rate derivatives, attended by analysts and 
traders. The talk was followed by a lot of good food. Then the next day, we would go 
around to various banks in the city to drum up business. It was fun and it even worked. I 
thank Don Marshall for his managerial congeniality and acumen in setting all this up. 
The Binomial Approximation to a Monte Carlo Simulation


We now consider how the binomial approximation can be connected to Monte 
Carlo simulation of the path integral for finite time steps and finite sized bins. 
The following figure shows the idea for three steps. 


[Figure: Equivalent binomial tree (binning the Monte Carlo paths from the path integral). Label in figure: equivalent binomial leg.]
The bins in $x_j$ can be defined through integrated values of the measure from
the initial time up to time $t_j$ such that the binomial model geometry appears.
Equivalently, we define bins just by counting MC paths (in a large simulation) to
reproduce binomial probabilities: e.g. (1/8, 3/8, 3/8, 1/8) at the third step $(x_3, t_3)$.
Therefore, for a given MC simulation we can draw both the MC paths and the
equivalent binomial tree with legs running between the bin centers. We discuss
the subject in more detail later in the chapter.
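The path-counting construction can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in production: paths are driftless Gaussian random walks with unit-variance steps, bins at step 3 are defined by sample quantiles chosen so that the fraction of paths in each bin matches the binomial probabilities, and the "bin center" is taken as the mean of the paths in the bin. All names are hypothetical.

```python
import numpy as np
from math import comb

def binomial_bin_fractions(n_steps=3, n_paths=200_000, seed=0):
    """Bin Monte Carlo path values at step n_steps so that bin populations
    reproduce the binomial probabilities C(n, k) / 2^n."""
    rng = np.random.default_rng(seed)
    # Value of each Gaussian random-walk path after n_steps steps
    x = rng.standard_normal((n_paths, n_steps)).sum(axis=1)
    probs = np.array([comb(n_steps, k) for k in range(n_steps + 1)]) / 2**n_steps
    # Bin edges: sample quantiles at the cumulative binomial probabilities
    edges = np.quantile(x, np.cumsum(probs)[:-1])
    bin_index = np.searchsorted(edges, x)          # bin 0 is the lowest node
    counts = np.bincount(bin_index, minlength=n_steps + 1)
    centers = np.array([x[bin_index == b].mean() for b in range(n_steps + 1)])
    return probs, counts / n_paths, centers

probs, fractions, centers = binomial_bin_fractions()
print("binomial probabilities:", probs)
print("fraction of MC paths  :", fractions)
print("bin centers           :", centers)
```

The legs of the equivalent binomial tree would then run between the bin centers at successive steps; repeating the construction at steps 1 and 2 gives the earlier nodes.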


Basic Numerical Procedure using Path Integrals 


The possible paths from the path integral "fan out". As opposed to the binomial
algorithm, the path-integral discretization is allowed to be non-uniform in the
spatial variable $x(t)$. This is useful in valuing complex options using importance
sampling with increased refinement in regions of sensitivity in $x(t)$.

In addition, the times can be chosen non-uniformly corresponding, for 
example, to cash-flow payments. For the case of free propagation exhibited 
below, the free propagator G, can be used over finite times. In the general case, 
the diagram still holds but the full path-integral must be used for the propagator. 
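A small numerical check makes the point about non-uniform discretization concrete. The sketch below assumes a driftless Gaussian free propagator with constant volatility and no discounting (a simplification of the propagators in this book), and uses a hypothetical non-uniform grid that is dense near the region where the integrand is large. It composes two propagation steps of unequal length by integrating over the intermediate point and compares the result with a single propagator over the total time.

```python
import numpy as np

def free_propagator(x_from, x_to, tau, sigma):
    """Driftless Gaussian propagator over a finite time tau (no discounting)."""
    var = sigma**2 * tau
    return np.exp(-(x_to - x_from) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def trapezoid_weights(x):
    """Trapezoid-rule weights on an arbitrary (non-uniform) increasing grid."""
    w = np.empty_like(x)
    w[1:-1] = 0.5 * (x[2:] - x[:-2])
    w[0] = 0.5 * (x[1] - x[0])
    w[-1] = 0.5 * (x[-1] - x[-2])
    return w

sigma = 1.0
tau1, tau2 = 0.3, 0.7            # unequal time steps
x0, x_final = 0.0, 0.5

# Non-uniform grid for the intermediate point x1: fine near 0, coarse in the tails
x1 = np.sinh(np.linspace(-3.0, 3.0, 401))
w1 = trapezoid_weights(x1)

two_step = np.sum(
    free_propagator(x0, x1, tau1, sigma) * free_propagator(x1, x_final, tau2, sigma) * w1
)
one_step = free_propagator(x0, x_final, tau1 + tau2, sigma)
print(two_step, one_step)
```

The two numbers should agree closely, which is why the exact propagator can be used across any interval that contains no exercise or cash-flow date, with grid points placed only where they are needed.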


The next figure shows the complete discretization for the first two steps in a 
possible numerical approximation using the path integral. 


⁶ Is the Binomial Approximation a “model”? Uh, NO! Often this binomial algorithm
is called a “model”, as in the salesman asking, “Do you guys use a binomial model to
price options?” This common appellation is inappropriate. The definition of a model
should include the assumptions and the parameters, not just the numerical algorithm.
Calling a numerical recipe a model leads to misleading statements about model
accuracy by only focusing on numerical convergence and not on the (possibly much
greater) uncertainties of the assumptions and parameters. See Ch. 33 on model quality
assurance. Nonetheless, the language is so common that everybody lapses into it.
[Figure: Path Integral Discretization (Two Steps). Each line represents a propagator $G\,dx$ between successive points. In the path integral numerical approach, both the spatial and the time discretizations may be nonuniform.]


Bins and Path Integrals 
Consider the figure below: 
[Figure: Definition of Bins for the Discretized Path Integral. Bin(b, j) at $t_j$ for $x_j$ has a center point and height $dx_j$. Path $\alpha$ runs through bin(b, j) in $x_j$ at time $t_j$.]


We already discussed binning paths when we talked about the connection
with the binomial approximation. In general, paths between the starting point $x_0$
and end point $x^*$ run through all allowed values of $\{x_j\}$ for $j = 1, \ldots, n-1$. We
consider these values to lie inside “bins”. The bins are vertical intervals with zero
widths and heights $\{dx_j\}$ on the $\{x_j\}$ axes, centered around points $\{x_j^{(b)}\}$.
Note that $dx_j$ is an ordinary calculus differential, not a time difference.

The bins⁷ serve as intervals for a numerical discretization of the $\{x_j\}$
ordinary integrals. Thus, a value of the variable $x_j$ at time $t_j$ on a given path $\alpha$


7 More on Bins and Numerical Approximations: These bins are important. First, as we 
describe in the next section, all paths passing through a bin b at time t; can be 
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{path a} 


called x, lies inside a bin interval for a specific bin(b, j). namely 


Ў {pan “| e(t - dx, / 2 , к * dx, | 2] 


J 


The path-integral probability measure weight for the set of points $\{x_j\}$ inside
bins of size $\{dx_j\}$ is the product of $n-1$ measures $G(x_{j-1}, x_j; t_{j-1}, t_j)\,dx_j$ for
successive propagation through $x_1, x_2, ..., x_{n-1}$, from initial time $t_0$ through
intermediate times $t_1, t_2, ..., t_{n-1}$, multiplied by a final free propagator
$G(x_{n-1}, x^*; t_{n-1}, t^*)$ to propagate from time $t_{n-1}$ to the final time $t^*$. All
possibilities of the values of $\{x_j\}$, or equivalently all paths, are then summed
over by the integrals over all intermediate $x_j$ values from $-\infty$ to $\infty$.

The paths themselves can be produced by randomly choosing the values of
the $\{x_j\}$ using the measure in the path integral. This is just Monte Carlo
simulation. We discuss the relation with path integrals below.


The Castresana-Hogan Path-Integral Discretization 


We now arrive at the most important section for this chapter. Juan Castresana⁸
and Marge Hogan, with brilliant insight, numerically discretized the path integral
in the 1980's. This discretized path-integral method has been used in production.

The main innovation here is to use a “nominal interest-rate grid" that 
simplifies the calculations. This nominal grid is mapped into the real interest rate 
grid as determined through the term-structure constraints. 


approximately collapsed to the central point of the bin. Thus, a lattice model can be
constructed from the Monte-Carlo approximation to the path integral. This approximation
becomes more accurate for smaller bin sizes. Note also that the bin sizes $dx_j$ can depend
on the bin b for importance sampling using the overall probability measure for a given
starting point, or equivalently for the density of paths. The bin size can also depend on $t_j$,
which can be important if, for example, we are near a possible exercise time or other time
of interest in which more or less accuracy is desired. Again, note that $dx_j$ is an ordinary
integration measure here, not a time-differenced coordinate.


8 Acknowledgements: I thank Juan Castresana for much hard and dedicated work
performing many numerical calculations over the years in two of my quant groups,
including using his path-integral discretization method.


The Castresana-Hogan Uniform Nominal Interest Rate Grid

The main idea starts with a nominal interest rate grid. This grid is uniform. The
nominal interest rate at time $t_j$ takes on values $R_j^{(\beta)}$ with $\beta = 1, ..., M$ an index.

[Figure: Grid of Nominal Rates $R^{(\beta)}$, to be mapped later into the Physical Rates.]

The main point is that the value $R_j^{(\beta)}$ at a given $\beta$ is independent of $j$, so
$R_j^{(\beta)} = R^{(\beta)}$. In the picture, at every time in the grid, the fourth point up is, for
example, 5%. Thus, in the drawing, $R^{(4)} = 5\%$. This is what is meant by
"uniformity", namely time-independence.

Note carefully, however, that the spacing between the nominal rates at fixed
time $t_j$, namely $dR^{(\beta)} = R^{(\beta+1)} - R^{(\beta)}$, does not need to be constant, as
indicated in the figure⁹. This can be used to advantage. The density of points can
be increased in regions where more accuracy is desired, for example near a
decision rate for cash flow determination or option exercise logic.

9 Notation: Careful. Note that this differential $dR^{(\beta)}$ is the difference between
neighboring nominal rates at a fixed time, not a change in rates at neighboring times.

The issues for numerical evaluation of complex interest rate derivatives 
involve obtaining numerical efficiency and speed. The term structure constraints 
in addition to any boundary conditions specific to the problem must be satisfied. 

We use the language of interest rates here, although the method is general and 
has been applied to equity products, etc. 


The Reason for the Uniform Grid 


The reason that the grid points are made time-independent is that the transition
probabilities are defined directly on the nominal grid. Because the nominal grid is
uniform, if the time interval and volatility are fixed, then the transition
probabilities $p^{(\beta'\beta)}$ between nodes at neighboring times on the grid will be
independent of time. For that reason the transition probabilities can be calculated
once and then reused. This results in a considerable saving of computer time.

If a volatility term structure is used, some time savings can still occur. This is
because the volatility term structure is typically determined in practice by fewer
constraints (caps, swaptions) than there are time points in the partition. Hence,
if a step-function approximation to the volatility term structure is used, several
intervals in the grid can still wind up using the same volatility. For example, if
there is a 2-year cap and a 4-year cap used for volatility constraints but no other
constraint between 2 and 4 years, then the same forward volatility can be used for
all transitions between 2 and 4 years and the probabilities reused in that interval.


The Transition Probabilities on the Uniform Grid 


The transition probabilities, as mentioned above, are defined directly on the
nominal grid. For lognormal dynamics, we get a familiar-looking form for
$p^{(\beta'\beta)}$ from node $R^{(\beta')}$ at time $t_{j-1}$ to node $R^{(\beta)}$ at time $t_j$ with volatility $\sigma_{j-1}$
and finite time interval $Dt = t_j - t_{j-1} > 0$. We lump points between $R^{(\beta)}$ and
$R^{(\beta+1)}$ together in a bin. Then $p^{(\beta'\beta)}$ is given by the difference of two normal
integrals,

(44.1)


The Number of Points at Fixed Time


The number $M$ of nominal grid points at fixed time should be regarded in the
sense of an ordinary integration partition. For example, we can set $M = 100$.
There are $M$ transitions on the grid from any given point to the various points at
the neighboring time. However, considerable time is saved by cutting off the
number of transitions if the probability of transition $p^{(\beta'\beta)}$ is below a small
value.
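
To make the reuse and cutoff ideas concrete, here is a minimal Python sketch of a transition matrix on a time-independent nominal grid. It is not a reproduction of Eqn. (44.1). The function names, the zero drift in $\ln R$ (the drift being left to the mapping into physical rates), and the choice of bin edges at the midpoints in $\ln R$ between neighboring nodes are all assumptions made for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal integral N(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def transition_matrix(rates, sigma, dt, cutoff=0.0):
    """Transition probabilities p[b_from][b_to] on a time-independent nominal grid.

    Assumptions (illustrative only): ln R diffuses with zero drift and
    variance sigma**2 * dt, and the bin around each node is bounded by the
    midpoints in ln R to its neighbors; the two outer bins extend to
    -infinity and +infinity in ln R so each row sums to one before pruning.
    """
    logs = [math.log(r) for r in rates]
    m = len(rates)
    # Bin edges in log space: -inf, interior midpoints, +inf.
    edges = [-math.inf] + [0.5 * (logs[b] + logs[b + 1]) for b in range(m - 1)] + [math.inf]
    width = sigma * math.sqrt(dt)
    matrix = []
    for b_from in range(m):
        row = []
        for b_to in range(m):
            upper = normal_cdf((edges[b_to + 1] - logs[b_from]) / width)
            lower = normal_cdf((edges[b_to] - logs[b_from]) / width)
            p = upper - lower                        # difference of two normal integrals
            row.append(p if p >= cutoff else 0.0)    # cut off negligible transitions
        total = sum(row)
        matrix.append([p / total for p in row])      # renormalize after the cutoff
    return matrix
```

Because the grid does not depend on time, the matrix can be computed once for each distinct pair (volatility, $Dt$) and reused at every time step that shares that pair.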


The Rectangular Grid and “Pruning the Tree” 


The rectangular geometry of the nominal grid is important. It automatically cuts
off rates that otherwise would become arbitrarily large. This cuts down
calculation time and avoids unphysical rates. Binomial tree implementations also
often use such spatial cutoffs, sometimes called “pruning the tree”¹⁰.


Minimum and Maximum Nominal Rates 


The minimum and maximum nominal rates are set up to cover the range of all
physically reasonable rates. For example, for the nominal grid for US treasury
rates we could write $R^{(\min)} = 50$ bp and $R^{(\max)} = 20\%$. These are to be regarded
as minimum and maximum cutoff points for ordinary integrals. If they need to be
extended (for example, when dealing with credit products with spread), they can simply
be redefined.


The First Time Interval 


The first time interval is a special case because the time to the first cash flow
from the value date is arbitrary. Although this is not required, often the rest of the
grid points are taken as spaced equally in time. The grid times are placed at cash-flow
points. A typical value of the spacing $Dt$ is 6 months, since coupons in
bonds are often semiannual. However, $Dt$ can be any value, and it can depend
on $t_j$.


The Physical Interest Rates and the Term Structure Constraints 


The interest rates $\{R^{(\beta)}\}$ in the nominal grid are NOT the physical interest rates.
The physical interest rates $\{r_j^{(\beta)}\}$ are obtained through a mapping of the nominal
interest rates. The mapping introduces parameters $\{\alpha_j\}$, one for each time $t_j$ in
the partition¹¹, as follows:

$r_j^{(\beta)} = \alpha_j R^{(\beta)}$   (44.2)

This mapping is a special case of the transformation of interest rates discussed in
Appendix B in the last chapter. It is set up so that with a lognormal assumption
the $\{\alpha_j\}$ parameters are directly related to the drift. However, the mapping can
be used with any probability dynamical assumption.

10 Acknowledgements: I first heard about “pruning the tree” from David Haan at Merrill
Lynch in 1986.


Determination of the Alpha Parameters via Term-Structure Constraints 


The $\{\alpha_j\}$ parameters are determined one at a time from the term structure
constraints. That is, we evaluate discount factors

$P^{(T)} = \left\langle \prod_{j=1}^{J_T} \frac{1}{1 + r_j^{(\beta)} Dt} \right\rangle$

for various maturities $T = J_T \cdot Dt$ using various values of $J_T$. Here, the bracket
$\langle \; \rangle$ indicates the expectation with respect to the probability measure. This
expectation involves the spatial index $\beta$. The time between partition points
$Dt = t_j - t_{j-1}$ is typically independent of $j$.

Using the mapping Eqn. (44.2) we get

$P^{(T)} = \left\langle \prod_{j=1}^{J_T} \frac{1}{1 + \alpha_j R^{(\beta)} Dt} \right\rangle$   (44.3)

We then determine the various $\{\alpha_j\}$ parameters from equating these discount
factors to known zero-coupon bond prices.
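
The one-at-a-time determination of the alphas can be sketched as a forward sweep through the grid. The Python below is only an illustration under stated assumptions, not the production method: the names `calibrate_alphas` and `lattice_zero_prices` are hypothetical, the rate over each interval is taken at the node occupied at the start of the interval with simple compounding, one transition matrix `p` is reused at every step, and each alpha is found by bisection (which presumes the input zero prices decrease with maturity).

```python
def calibrate_alphas(rates, p, start, dt, zero_prices):
    """Solve for alpha_j one at a time so lattice zero prices match the inputs.

    Assumptions (illustrative): the rate over [t_j, t_{j+1}] is
    alpha_j * rates[b] at the node b occupied at t_j, with simple compounding,
    and the same transition matrix p applies at every step.
    """
    m = len(rates)
    state = [0.0] * m          # discounted probability of sitting in each node
    state[start] = 1.0
    alphas = []
    for target in zero_prices:             # target = zero price for the next maturity
        def bond(alpha):
            return sum(q / (1.0 + alpha * r * dt) for q, r in zip(state, rates))
        lo, hi = 0.0, 1.0
        while bond(hi) > target:           # bracket the root; bond() decreases in alpha
            hi *= 2.0
        for _ in range(200):               # bisection
            mid = 0.5 * (lo + hi)
            if bond(mid) > target:
                lo = mid
            else:
                hi = mid
        alpha = 0.5 * (lo + hi)
        alphas.append(alpha)
        # Roll the discounted state probabilities forward one step.
        discounted = [q / (1.0 + alpha * r * dt) for q, r in zip(state, rates)]
        state = [sum(discounted[b] * p[b][b2] for b in range(m)) for b2 in range(m)]
    return alphas

def lattice_zero_prices(rates, p, start, dt, alphas):
    """Reprice the zero-coupon bonds on the lattice for given alphas."""
    m = len(rates)
    state = [0.0] * m
    state[start] = 1.0
    prices = []
    for alpha in alphas:
        discounted = [q / (1.0 + alpha * r * dt) for q, r in zip(state, rates)]
        prices.append(sum(discounted))
        state = [sum(discounted[b] * p[b][b2] for b in range(m)) for b2 in range(m)]
    return prices
```

Repricing the zeros with `lattice_zero_prices` after calibration is a quick consistency check that the term-structure constraints are satisfied.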


Path Integral Discretization and Stochastic Equations 

We have emphasized repeatedly that the path integral formalism is not only
consistent with the stochastic equations, but the path integral is explicitly
constructed using stochastic equations as constraints. In this section we
re-emphasize this fact and show how the uniform grid is connected with stochastic
equations. To do this, we assume lognormal dynamics. Write

$y_j^{(\beta)} = \ln r_j^{(\beta)} = \ln \alpha_j + \ln R^{(\beta)}$   (44.4)

Consider the transition from some node $y_{j-1}^{(\beta')}$ at time $t_{j-1}$ to the node $y_j^{(\beta)}$ at
time $t_j = t_{j-1} + Dt$. Then the time change $d_t y_{j-1}^{(\beta'\beta)} = y_j^{(\beta)} - y_{j-1}^{(\beta')}$ over
time interval $Dt$ is $d_t y_{j-1}^{(\beta'\beta)} = \mu_{j-1} Dt + \sigma_{j-1}\, dz_{j-1}^{(\beta'\beta)} \sqrt{Dt}$, where $dz_{j-1}^{(\beta'\beta)}$ is a
Gaussian random variable with zero mean and width one.

Identifying the drift term from the stochastic equation with the drift term
from Eqn. (44.4) produces $\mu_{j-1} Dt = \ln(\alpha_j / \alpha_{j-1})$. This establishes the
connection between the drift and the alpha parameters for the lognormal model.
This ends the description of the Castresana-Hogan method.

11 Notation: Please do not confuse the Castresana $\{\alpha_j\}$ parameters with the Monte-Carlo
path label $\alpha$ or the Laplace transform parameter $a$ (see definition below).


Path Integrals, MC Simulation, Lattices, and Brownian Bridges 


So far, the path integral may sound different from the paths generated in a cone
by a binomial algorithm, or other lattice algorithm. However, as the time interval
vanishes, $\Delta t_j = t_j - t_{j-1} \to 0$, the results are the same. This occurs because paths
with large $x_j - x_{j-1}$ for small $\Delta t_j$ are numerically suppressed by the Gaussian
damping in $G$.

Moreover, a direct connection between MC simulations and lattice
approximations can be constructed. As discussed above, in an approximate sense
all paths passing through a given bin, bin(b, j) at time $t_j$, can be collapsed to
the central point of the bin, forming a point of a lattice. The collection of such
paths is called a Brownian Bridge. Thus, a lattice model can be constructed from
the path integral. This approximation becomes increasingly accurate for smaller
bin sizes. The idea is useful because logic may need to be performed comparing
different quantities at each intermediate time for complex options. Reducing the
number of points at which calculations are performed facilitates this task.
Consider the following picture that illustrates the ideas:

[Figure: Construction of Lattice of Path Bins. All $N$ paths from bin(b', j-1) at $t_{j-1}$
in $x_{j-1}$, to bin(b, j) at $t_j$ in $x_j$, are collapsed to one effective path (a Brownian
Bridge) with weight $N$.]


Monte Carlo Simulation Using Path Integrals - Illustration:


Monte Carlo simulation can be done in the standard brute-force method simply
by using the stochastic equations. However, a better idea is to use the binning
procedure. We generate paths between bin centers, as shown in the figure below:

[Figure: Monte-Carlo Simulation from Path Integrals, showing paths between bins such
as bin($n_j$, j) and bin($n_j - 1$, j).]

The probability of a path from bin(b', j-1) at $t_{j-1}$ in $x_{j-1}$ to a given
bin(b, j) at $t_j$ in $x_j$ is determined by the appropriate propagator.

Denote $n_j$ as the number of bins at time $t_j$ in the $x_j$ variable. The bins need
not have the same heights; this provides useful flexibility. Call $h_{b,j}$ (or $dx_j^{(b)}$) the
height of the bin (b, j). Then $G(x_{j-1}^{(b')}, x_j^{(b)}; t_{j-1}, t_j) \cdot h_{b,j}$ (not
including discounting) is approximately the probability of starting at a point in
bin(b', j-1) and ending somewhere in bin(b, j). Clearly, we have

$\sum_{b=1}^{n_j} G(x_{j-1}^{(b')}, x_j^{(b)}; t_{j-1}, t_j) \cdot h_{b,j} = 1$

We segment the unit interval into $n_j$ partitions with lengths
$G(x_{j-1}^{(b')}, x_j^{(b)}; t_{j-1}, t_j) \cdot h_{b,j}$. Every bin has a
unique position corresponding to the integrated probability. Now we generate
random numbers $R^{\#}$ from the uniform distribution U(0,1). Given a
particular random number $R^{\#}_{\alpha}$, one of the various bins at $t_j$ is chosen for Path $\alpha$
from bin(b', j-1), namely the bin corresponding to the segment of the partitioned unit
interval into which $R^{\#}_{\alpha}$ falls. An example of the idea is in the picture below:

[Figure: Random Number Assignment to Bin through Integrated Probability, for given
$x_{j-1}$. Example: the integrated probability at the chosen bin lies in (0.5, 0.6).]
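
The assignment of a uniform random number to a bin through the integrated probability can be sketched in a few lines of Python. This is only an illustration: the function names and the list-of-lists `transition` table (row b' holding the probabilities $G \cdot h$ of reaching each bin b at the next time) are assumptions, and the same table is reused at every step for simplicity.

```python
import bisect
import itertools
import random

def choose_bin(probabilities, u):
    """Return the index of the bin whose segment of [0, 1) contains u."""
    cumulative = list(itertools.accumulate(probabilities))
    index = bisect.bisect_right(cumulative, u)
    return min(index, len(probabilities) - 1)   # guard against rounding when u is near 1

def simulate_bin_path(transition, start_bin, n_steps, rng):
    """Generate one path as a sequence of bin indices."""
    path = [start_bin]
    for _ in range(n_steps):
        path.append(choose_bin(transition[path[-1]], rng.random()))
    return path
```

For example, with bin probabilities 0.1, 0.4, 0.1, 0.4 the third bin occupies the segment from 0.5 to 0.6 of the unit interval, so a random number of 0.55 selects that bin. A path is then generated with `simulate_bin_path(transition, start_bin, n_steps, random.Random(seed))`.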


Applications of the Hybrid Monte-Carlo/Lattice Approach 


One advantage of this hybrid lattice Monte-Carlo method is that standard lattice 
back-chaining methods can be used as appropriate for the problem. Then, Monte 
Carlo simulation can be used following the paths generated as above. In this way, 
Monte-Carlo simulation can be applied to problems that would otherwise be 
intractable. One example is Bermuda or American options, discussed below. 

Another example is index-amortizing swaps (IAS), where the payoffs are
path dependent. We discussed IAS in Ch. 16.


Smart Monte Carlo (SMC)


“Smart” Monte Carlo or SMC was formulated in finance¹²,¹³ by Dash and Yang,
and investigated within the path integral framework. SMC can be useful for
counterparty risk or other simulations.

A major problem for simulation is that, while it is cheap to generate MC
paths, it is expensive to call the pricing routines to get the values of deals along
the paths. For simplicity consider one deal. The idea of SMC is to concentrate
adaptively the paths for which the deal pricer is called in the region
where the deal price is most sensitive and/or has the most information.

The idea is simple. Consider an OTM put stock option $C$ (i.e. with strike
$E \ll S_0$ for initial price $S_0$ at starting time $t_0$), priced at forward times along
MC paths $\{\text{Path}^{(\alpha)}\}$, with the paths indexed by $\alpha$. A stock price $S^{(\alpha)}(t)$ on
path $\text{Path}^{(\alpha)}$ at forward time $t$ likely produces a small option price
$C[S^{(\alpha)}(t)]$, because most stock prices on most paths do not contribute much to
an OTM option. Thus the computer time spent calling the option pricer most of
the time along most paths in a standard simulator is basically worthless.


First Generate the Scout Paths, then the Rest of the Paths 


Instead, for SMC, we first generate some “scout paths” randomly in order to
sample the possible states. We then generate the rest of the paths using directed
Brownian bridges (cf. above and Ch. 42, App. 2), focused in regions where the
option prices returned by the scout paths are appreciably different from zero. This
avoids the problem of a standard simulator, where most of the paths have
negligible OTM option prices and time is wasted calling the pricer
unnecessarily.


Interpolation for RN pricers along SMC paths 


We also need interpolation. For example, SMC can be applied with real world
(RW) dynamics for PFE counterparty risk (cf. Ch. 31). Risk neutral (RN)
calculations are done first to get security prices at some points at each time. Then
these RN prices are interpolated at a given time to find the prices on the RW
paths at each time. This interpolation procedure is approximately equivalent to
obtaining RN prices directly using the values of the RW variables along RW
paths as input to the RN pricers.

12 Smart Monte Carlo: Related ideas were previously investigated in published scientific
literature, which I discovered after doing this work (ref).

13 Acknowledgements: I thank Adam Litke for suggesting this problem. I solved it with
SMC.


SMC speedup 
In tests, SMC speeds up calculations markedly and also increases accuracy. 


American Monte Carlo (AMC) 


Many have contributed to Monte-Carlo (MC) evaluation of Bermuda/American
options, called American Monte Carlo or AMC. Here we discuss AMC in the
path integral framework¹⁴.

Consider a MC evaluation of a European option. A path on which the option
is exercised at expiration has a cash flow equal to the intrinsic value on the path.
The sum of such discounted cash flows, divided by the total number of paths, is
the option value.

Now consider a Bermuda option. The basic idea for AMC is to use the
standard back chaining algorithm (cf. Appendix 1) in a MC context to determine
paths on which exercise occurs, and then to move forward along each path to
find the first time that exercise occurs on that path.

An interpolation in the underlying variable(s) is used at each time step.

The sum of the discounted cash flows (CFs), each CF arising from exercising
on a specific path, divided by the total number of paths, is the AMC option value.


"MC Simulation within a MC Simulation" - Just Bin the Paths to see


Any Monte Carlo simulation, in particular the AMC, can be viewed as a “MC
simulation within a MC simulation". To see this, we bin the paths; to get the
idea just look at the two figures above. Going forward in time, all paths arriving
in bin(b', j-1) at time $t_{j-1}$ from previous times are lumped together in
bin(b', j-1).

The essential point is that the same paths continuing further in time to all bins
{bin(b, j)} at time $t_j$ can be thought of as paths in a MC simulation emanating
from bin(b', j-1). This is exactly what a MC simulation would look like
starting at bin(b', j-1), considered as a point.

The binning procedure within the path integral context thus produces the
picture for the “MC simulation within a MC simulation” in a simple way.

The AMC procedure uses the MC simulation within a MC simulation, as we
describe next.

14 Acknowledgements: I thank Xipei Yang and Marcelo Piza for discussions on path
integrals and AMC.


AMC Path Integral, Step 1 (Backward Propagation to $t_{j-1}$ before $t_j$)


We start at the last date to be considered at the deal maturity, and proceed
backwards in time. By induction consider the time $t_{j-1}$ before the time $t_j$. We
need the average value $C(x_{j-1}^{\text{bin}(b',j-1)}, t_{j-1})$ in the bin bin(b', j-1) at time $t_{j-1}$.

We get this average value by summing over the values $C(x_j^{\text{bin}(b,j)}, t_j)$ in all
bins {bin(b, j)} at the later time $t_j$ multiplied by the backward Green function
$G(x_{j-1}^{\text{bin}(b',j-1)}, x_j^{\text{bin}(b,j)}; t_{j-1}, t_j)$ for the probability density of transition (also
including discounting), times the height of the bin $h_{b,j}$, viz¹⁵

$C(x_{j-1}^{\text{bin}(b',j-1)}, t_{j-1}) = \sum_{b} C(x_j^{\text{bin}(b,j)}, t_j)\; G(x_{j-1}^{\text{bin}(b',j-1)}, x_j^{\text{bin}(b,j)}; t_{j-1}, t_j)\; h_{b,j}$   (44.5)


AMC Path Integral, Step 2 (Interpolation at time $t_{j-1}$)


The weights for the interpolation of $C(x_{j-1}^{\text{bin}(b',j-1)}, t_{j-1})$ at time $t_{j-1}$ over all bins
{bin(b', j-1)} are given by the probabilities of transition from the starting
point $(x_0, t_0)$ going forward to each bin bin(b', j-1), multiplied by the bin
height $h_{b',j-1}$. This quantity is just $G(x_0, x_{j-1}^{\text{bin}(b',j-1)}; t_0, t_{j-1}) \cdot h_{b',j-1}$. This procedure is
the probabilistic realization of the standard MC path generation starting from the
initial state $(x_0, t_0)$ to {bin(b', j-1)} at time $t_{j-1}$.

15 Remark: We can also use the more exact result integrating the Green function over the
bin width; for clarity we omit this. The Green function for simplicity of notation includes
the discount factor here.


AMC Path Integral, Step 3 (Exercise Logic at time $t_{j-1}$)


Next, we find the critical exercise point $x_{j-1}^{(\text{critical})}$ at time $t_{j-1}$ and, if exercised,
replace the back-chained values by the intrinsic values. We then continue by
induction back to bins at time $t_{j-2}$ from all bins {bin(b', j-1)} at time $t_{j-1}$,
eventually stopping at the initial state $(x_0, t_0)$.


AMC Path Integral, Step 4 (Forward Simulation)


As mentioned above, we then walk forward on each path (defined for path
integrals through the bin-to-bin transitions) and record, if exercise occurs, the
first time exercise occurs. Further transitions forward in time from a bin at which
exercise occurs are ignored (since the deal has exercised in that case)¹⁶.
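
The backward and forward steps above can be illustrated with a small Python sketch for a Bermuda put. Everything specific here is an assumption made for the illustration: the function names, a Gaussian propagator in $x = \ln S$ with constant parameters, uniform bins, exercise allowed at every grid date after the start, and bin probabilities normalized to sum to one. Since every path sits on bin centers, the interpolation of Step 2 is not needed in this simplified setting.

```python
import math
import random

def gaussian_bin_probs(x_from, centers, h, drift, vol, dt):
    """Approximate bin-to-bin probabilities G(x', x; dt) * h, normalized to sum to one."""
    var = vol * vol * dt
    w = [math.exp(-(x - x_from - drift * dt) ** 2 / (2.0 * var)) * h for x in centers]
    total = sum(w)
    return [v / total for v in w]

def amc_bermuda_put(s0, strike, r, vol, dt, n_steps, n_bins, n_paths, seed=0):
    """Backward induction on bins, then forward simulation recording first exercise."""
    drift = r - 0.5 * vol * vol
    x0 = math.log(s0)
    half = 5.0 * vol * math.sqrt(n_steps * dt)          # grid covers +/- 5 standard deviations
    h = 2.0 * half / (n_bins - 1)
    centers = [x0 - half + b * h for b in range(n_bins)]
    probs = [gaussian_bin_probs(x, centers, h, drift, vol, dt) for x in centers]
    first = gaussian_bin_probs(x0, centers, h, drift, vol, dt)
    disc = math.exp(-r * dt)
    intrinsic = [max(strike - math.exp(x), 0.0) for x in centers]

    # Steps 1-3: back-chain values and flag the exercise bins at each date.
    value = intrinsic[:]
    exercise = [[v > 0.0 for v in intrinsic]]           # flags at maturity
    for _ in range(n_steps - 1):
        cont = [disc * sum(p * v for p, v in zip(row, value)) for row in probs]
        flags = [iv > 0.0 and iv >= c for iv, c in zip(intrinsic, cont)]
        value = [iv if f else c for iv, c, f in zip(intrinsic, cont, flags)]
        exercise.append(flags)
    exercise.reverse()                                   # exercise[j] now refers to date j + 1
    lattice_value = disc * sum(p * v for p, v in zip(first, value))

    # Step 4: walk forward bin to bin; stop at the first exercise on each path.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        row = first
        for j in range(n_steps):
            b = rng.choices(range(n_bins), weights=row)[0]
            if exercise[j][b]:
                total += disc ** (j + 1) * intrinsic[b]
                break
            row = probs[b]
    return lattice_value, total / n_paths
```

The function returns both the back-chained value and the forward Monte Carlo average of discounted exercise cash flows. The two should agree within Monte Carlo noise, since the forward paths use the same bin-to-bin probabilities and the same exercise flags as the backward sweep.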


Multivariate AMC 


Note that the variable can be a vector x. We merely use the appropriate multi- 
dimensional Green function. A complication occurs for FX because the FX drift 
depends on interest rates that may also be simulated (cf. Ch. 5), correlating these 
processes together. 


Difference using random numbers for standard MC and path integral MC


There is a difference for the use of random numbers between ordinary MC
simulation of paths and the path-integral MC¹⁷. For an ordinary MC simulation,
random numbers are used for the time difference $d_t x$ in the stochastic equations
for the paths moving between successive infinitesimally separated times
$(t, t + dt)$. For the path integral that uses transition probability functions, the
random numbers are used to sample each integral at each $x(t)$ in bin size $dx$, at
each fixed time $t$ that occurs in the path integral, as in ordinary integration¹⁸.

16 Story: In the late 1980’s I examined different classes of paths that went in and out of
exercise. At that time the backward/forward AMC procedure was not formulated, so it
was not clear how to treat these paths.

17 NOTE: $d_t x$ and $dx$ are completely different: $d_t x$ measures the difference in x at
DIFFERENT times (t, t+dt) through the stochastic equations, while $dx$ measures the
difference in x(t) at the SAME time t, as in ordinary calculus, for the ordinary calculus
x-integrals that appear in the path integral.

18 Story: This brought me back to the first research problem I did in grad school,
evaluating multidimensional integrals using Monte Carlo sampling methods.


New Interpolation Methods for SMC and AMC


For both SMC and AMC, interpolation is essential. For AMC, interpolation is
part of the standard procedure (cf. Step 2 above).


Hermite Polynomial Interpolation

Standard interpolating functions are polynomials. A special case is the Hermite
polynomials $H_n(y)$.

The Gauss transform of the Hermite polynomial is defined as

$G_u[H_n(y)](x) = \frac{1}{\sqrt{2\pi u}} \int_{-\infty}^{\infty} dy\; H_n(y)\, \exp\left[-\frac{(x-y)^2}{2u}\right]$   (44.6)

Hermite polynomials enjoy a reproducing Gauss-transform relation,

$G_u[H_n(y)](x) = (1-2u)^{n/2}\, H_n\left[(1-2u)^{-1/2}\, x\right], \quad 0 < u < 1/2$   (44.7)

The translation of variables is $x_{j-1} = x$ and $x_j = y$.

Using the Gauss transform, Hermite polynomials at one time are propagated
back to an earlier time, still as Hermite polynomials, by the Gaussian transition
probability. This simplifies the AMC procedure.
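
A quick numerical check of the reproducing relation is sketched below in Python. The conventions are assumptions on my part: physicists' Hermite polynomials (with $H_1(y) = 2y$), and a Gaussian kernel of variance $u$ centered at $x$. The function names are illustrative only.

```python
import math

def hermite(n, y):
    """Physicists' Hermite polynomial H_n(y) by the standard recurrence."""
    h_prev, h = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
    return h

def gauss_transform(n, x, u, n_points=4001, n_sigma=10.0):
    """Trapezoid-rule estimate of the Gaussian average of H_n(y) around x with variance u."""
    width = math.sqrt(u)
    step = 2.0 * n_sigma * width / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        y = x - n_sigma * width + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * hermite(n, y) * math.exp(-(x - y) ** 2 / (2.0 * u))
    return total * step / math.sqrt(2.0 * math.pi * u)

def closed_form(n, x, u):
    """Right-hand side of the reproducing relation, valid for 0 < u < 1/2."""
    scale = math.sqrt(1.0 - 2.0 * u)
    return scale ** n * hermite(n, x / scale)
```

Comparing `gauss_transform(n, x, u)` with `closed_form(n, x, u)` for a few low orders shows agreement to numerical precision under these conventions, which is what lets a Hermite expansion of the option value be carried back one time step without re-fitting.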


New Interpolations — Cauchy and Prony 
Here are two new ideas: 


• Cauchy interpolating functions
• Prony interpolating functions


Cauchy interpolating functions (for any spatial dimension) 


Given points $\{x_b\}$ in any spatial dimension¹⁹ at which the security $C$ is evaluated
as $\{C(x_b)\}$ at a definite time, we want to evaluate an interpolated value
$C_{NN}(x^*)$ at $x^*$ and the same time. To this end, we define a weight $w(x^*; x_b)$
that decreases with increasing distance between $x^*$ and $x_b$. The weight depends
on a cutoff parameter $\kappa^2$ to prevent it from blowing up at zero distance.
Specifically, given the distance metric $d^2(x^*; x_b) = \| x^* - x_b \|^2$ and designating
$NN(x^*; x_b)$ to mean a certain number of nearest neighbor (NN) points²⁰ $\{x_b\}$
to $x^*$, we define the Cauchy weight as²¹

$w(x^*; x_b) = \frac{\left[ d^2(x^*; x_b) + \kappa^2 \right]^{-1}}{\sum_{NN(x^*; x_{b'})} \left[ d^2(x^*; x_{b'}) + \kappa^2 \right]^{-1}}$   (44.8)

We then get the interpolated security value as

$C_{NN}(x^*) = \sum_{NN(x^*; x_b)} w(x^*; x_b)\, C(x_b)$   (44.9)

19 Notation: We omit the vector arrow in multi-dimensional space and the time for
notational simplicity here.

Path-to-Path Distance Formalism 


It is possible to generalize interpolation to path-dependent options. The basic tool
is a path-to-path distance between Path $\alpha$ and Path $\beta$. The formalism uses two
terms, a “potential energy PE distance" and a “kinetic energy KE distance". The
PE distance $d_{PE}^{(\text{Path } \alpha, \text{Path } \beta)}$ contains path-to-path spatial differences with
normalization “spring constants” $\nu_j$. At time $t_j$ we write

$\left[ d_{PE}^{(\text{Path } \alpha, \text{Path } \beta)}(t_j) \right]^2 = \nu_j^2 \left[ x_j^{(\alpha)} - x_j^{(\beta)} \right]^2$   (44.10)

The KE distance $d_{KE}^{(\text{Path } \alpha, \text{Path } \beta)}$ involves the same “z-score” stochastic variable $\eta(t)$
as in Ch. 42, viz $\eta(t) = \frac{d_t x(t) - \mu\, dt}{\sigma\, dt}$ with $\mu, \sigma$ the drift and volatility. Note $\eta^2 dt$
is dimensionless. With $\eta_j = \eta(t_j)$ at time $t_j$,

$\left[ d_{KE}^{(\text{Path } \alpha, \text{Path } \beta)}(t_j) \right]^2 = \left[ \eta_j^{(\alpha)} - \eta_j^{(\beta)} \right]^2$   (44.11)

The squared PE and KE distances are integrated (summed) over time and
added to get the square of the total path-to-path distance. Between time $t_a$ and
time $t_b$ we get

$\left[ d^{(\text{Path } \alpha, \text{Path } \beta)}(t_a, t_b) \right]^2 = \int_{t_a}^{t_b} dt \left\{ \left[ d_{PE}^{(\text{Path } \alpha, \text{Path } \beta)}(t) \right]^2 + \left[ d_{KE}^{(\text{Path } \alpha, \text{Path } \beta)}(t) \right]^2 \right\}$   (44.12)

This distance can be used as a weight for interpolations that involve path-
dependent quantities.

20 Number of nearest neighbors: In practice for $x^*$ not too close to any $\{x_b\}$, around 5
nearest neighbors gives reasonable numerical results in the cases tested.

21 Number of NNs in the approach to evaluation point: As $x^*$ approaches $x_b$, only one
nearest neighbor is used, namely $x_b$. Then the weight approaches one and there is only
one term in the sum over nearest neighbors, giving the correct normalization for the
interpolated security value in that limit. To prevent jumps, the other terms should be
turned off smoothly.


Prony Interpolating Functions 


We shall look at Prony functions in Ch. 52. For a put stock option with strike $E$,
the following combination of two Prony functions can be a useful interpolator
(here $\theta$ is the Heaviside function):

$C^{\text{Prony}}_{\text{Put Interp}}(S) = \theta(E - S)\, A \exp(-\lambda_1 S) + \theta(S - E)\, B \exp\left[ -\lambda_2 (S - E) \right]$   (44.13)


Greeks and Path Integrals 


Greeks can be calculated explicitly using path integrals for simple dynamics.
We consider here the pedagogical case of delta for a European stock option.
More complicated cases (barrier options, Bermuda options, interest rate
options...) can also be treated.

Denote the option price as $C(x_0, t_0)$ today $t_0$ with underlying $x_0$, exercising
at time $t^*$ with payoff $C(x^*, t^*)$ and time to exercise $\tau^* = t^* - t_0$. Assume
constant parameters. Write the transition probability, also including discounting,
between $(x_0, t_0)$ and $(x^*, t^*)$ in $dx^*$, as $G(x_0 - x^*; \tau^*)\, dx^*$. We have the usual
expression

$C(x_0, t_0) = \int_{-\infty}^{\infty} G(x_0 - x^*; \tau^*)\, C(x^*, t^*)\, dx^*$   (44.14)

Delta is the derivative of this expression with respect to $x_0$,

$\Delta(x_0, t_0) = \frac{\partial}{\partial x_0} \int_{-\infty}^{\infty} G(x_0 - x^*; \tau^*)\, C(x^*, t^*)\, dx^*$   (44.15)

This is an ordinary calculus integral. Notice that the spatial dependence of the
Green function (in this simple case) is on $x_0 - x^*$. Therefore we can transfer the
derivative from $x_0$ to $x^*$ using the ordinary calculus identity

$\frac{\partial G(x_0 - x^*; \tau^*)}{\partial x_0} = - \frac{\partial G(x_0 - x^*; \tau^*)}{\partial x^*}$   (44.16)

so $\Delta(x_0, t_0) = - \int_{-\infty}^{\infty} \frac{\partial G(x_0 - x^*; \tau^*)}{\partial x^*}\, C(x^*, t^*)\, dx^*$. Next we integrate by parts,
using the fact that the Green function vanishes rapidly as $x^* \to \pm\infty$. We obtain
$\Delta(x_0, t_0) = \int_{-\infty}^{\infty} G(x_0 - x^*; \tau^*)\, \frac{\partial C(x^*, t^*)}{\partial x^*}\, dx^*$. But $\frac{\partial C(x^*, t^*)}{\partial x^*} = \Delta(x^*, t^*)$. So,

$\Delta(x_0, t_0) = \int_{-\infty}^{\infty} G(x_0 - x^*; \tau^*)\, \Delta(x^*, t^*)\, dx^*$   (44.17)

Eq. (44.17) says that in this simple case, delta $\Delta(x_0, t_0)$ is the discounted
expectation of delta $\Delta(x^*, t^*)$ at expiration.
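
The statement of Eq. (44.17) can be checked numerically. The Python sketch below does this for a European call under assumptions chosen for illustration: $x = \ln S$, a discounted Gaussian Green function with constant rate and volatility (Black-Scholes drift), and simple trapezoid integration. The function names are hypothetical, and delta here is the derivative with respect to $x_0$, not with respect to the stock price.

```python
import math

def green(x0, x_star, tau, r, vol):
    """Discounted Gaussian Green function in x = ln S; depends only on x0 - x_star."""
    var = vol * vol * tau
    mean = x0 + (r - 0.5 * vol * vol) * tau
    density = math.exp(-(x_star - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return math.exp(-r * tau) * density

def integrate(f, lo, hi, n=20000):
    """Composite trapezoid rule."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step

def call_price(x0, strike, tau, r, vol):
    """Option value as the Green-function integral over the payoff."""
    payoff = lambda x: max(math.exp(x) - strike, 0.0)
    lo, hi = x0 - 10.0 * vol * math.sqrt(tau), x0 + 10.0 * vol * math.sqrt(tau)
    return integrate(lambda x: green(x0, x, tau, r, vol) * payoff(x), lo, hi)

def delta_from_terminal_delta(x0, strike, tau, r, vol):
    """Discounted expectation of the terminal delta, d(payoff)/dx* at expiration."""
    terminal_delta = lambda x: math.exp(x) if x > math.log(strike) else 0.0
    lo, hi = x0 - 10.0 * vol * math.sqrt(tau), x0 + 10.0 * vol * math.sqrt(tau)
    return integrate(lambda x: green(x0, x, tau, r, vol) * terminal_delta(x), lo, hi)
```

A central finite difference of `call_price` in $x_0$ should agree with `delta_from_terminal_delta` up to the integration error, and both should be close to the closed-form value $S_0 N(d_1)$ for this case.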


More Complicated Cases 


Now consider the case when we have multiple convolutions of Green functions.
We simply repeat the above procedure, consecutively transferring the derivative
through the multiple integral by using the symmetry property of the Green
function and integration by parts, exactly as above. The following considerations
hold:

• Barriers present no difficulty because the Green function [including
necessary image(s)] vanishes on the barrier.

• Mean reversion complicates the symmetry. Therefore as the above
procedure is carried out, extra terms involving mean reversion appear
and must be tracked.

• Bermuda options result in classes of paths, as described in Ch. 42, which
also appear here.

• More complicated models lead to more complicated results.


Vega and Path Integrals 


Vega is more complicated than delta. Similar procedures can be adopted in
simple cases, but the path integral generating functional must also now be
calculated by introducing auxiliary variables $\{\eta_j\}$ at times $\{t_j\}$. Derivatives
with respect to these auxiliary variables must be carried out. In practice a
numerical procedure would be used to approximate these derivatives.


Normal Integral Approximations 


Here is the venerable single normal integral approximation from Abramowitz-
Stegun (AS, ref) for $x \ge 0$. Setting $T_{px} = 1/(1 + px)$ with $p = 0.2316419$,
up to uniform error with bound $|\varepsilon(x)| < 7.5 \times 10^{-8}$, we have

$N(x) \cong 1 - \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \sum_{k=1}^{5} b_k\, T_{px}^{\,k}$   (44.18)

In Eq. (44.18), $b_1 = 0.31938153$, $b_2 = -0.356563782$, $b_3 = 1.781477937$,
$b_4 = -1.821255978$, $b_5 = 1.330274429$.
For $x < 0$, use $N(-x) = 1 - N(x)$.
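
Eq. (44.18) translates directly into a few lines of Python. The function name `normal_integral` is mine; the constants are those quoted above.

```python
import math

# Coefficients b_1 ... b_5 and the constant p of the polynomial approximation.
B = (0.31938153, -0.356563782, 1.781477937, -1.821255978, 1.330274429)
P = 0.2316419

def normal_integral(x):
    """Cumulative normal N(x) from the five-term polynomial approximation."""
    if x < 0.0:
        return 1.0 - normal_integral(-x)       # N(-x) = 1 - N(x)
    t = 1.0 / (1.0 + P * x)
    poly = sum(b * t ** (k + 1) for k, b in enumerate(B))
    return 1.0 - math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) * poly
```

A comparison against `0.5 * (1 + math.erf(x / math.sqrt(2)))` over a range of arguments is an easy way to confirm the quoted error bound.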


Example of use of the Normal Integral - probability for stock diffusion: 


Using the notation of Ch. 42, the probability that the stock price $S^* = e^{x^*}$ at
future time $t^*$ is above a value $E$ is the normal integral $N(d_-)$ [or $N(d_2)$]:

$N(d_-) = \text{Prob}(S^* > E)$   (44.19)

The Gaussian probability density at $t^*$ starting from time zero has a width of
$\sigma \sqrt{t^*}$ and a height of order $1/(\sigma \sqrt{t^*})$.


Framework for Multivariate Gaussian Integral Approximations 


It is a challenge to derive a uniform approximation to multivariate normal 
(Gaussian) integrals generalizing Eqn. (44.18). The payoff for a reliable and fast 
approximation for multivariate Gaussian integrals is high, since many models 
wind up with these integrals. For bivariate integrals, a variety of useful methods 
exists (cf. Ch. 19 refs.), but even here, none are sufficiently uniform. Different 
methods work best for different regions of the two variables. Few methods exist 
for multivariate Gaussian integrals. 

In the late 1980's, I attempted to use a cluster decomposition framework (Ch. 
49) to handle singular behaviors of multivariate Gaussian integrals at boundaries, 
and then tried to get a uniform approximation to each term in the cluster 
decomposition. This didn't work. However, a completely different idea surfaced, 
described next, in terms of a perturbative expansion. 


Perturbation Expansion for Multivariate Gaussian Integrals 


I developed a quasi-analytic perturbation expansion for multivariate Gaussian 
integrals’. Basically, the big complicated multivariate Gaussian integral is 
broken up into simple integrals. The perturbation expansion, which is exact, is an 
infinite series of lower-dimensional integrals (one-dimensional in the simplest 
approximation). 

The expansion was calculated theoretically to 2nd order. Padé approximants 
are needed for numerical applications. Some numerical work was carried out". 

The idea with some complications applies to multivariate Student-t integrals. 


Application to Correlated Idiosyncratic Risk in Factor Models 


Recently, the above idea was used to motivate a starting point to calculate 
correlated idiosyncratic residuals for factor models *. The usual assumption is 
that the idiosyncratic residuals in the cross-sectional regressions are not 
correlated. However, this leads to defects in the correlation matrix. Correlated 
idiosyncratic residuals can improve the situation in a relatively parsimonious 
fashion. See Ch. 25 for details. 


Appendix 1: Back-Chaining Bermuda Algorithm, Critical Path


This appendix gives details of backward diffusion for a bond with an embedded 
Bermuda call, using the path integral Green function framework. We show the 
steps using the full bond P or the “bullet” B (the bond with the call removed). 

If we deal directly with the full bond P , we do not need the critical/Atlantic 
path describing the boundary for option exercise. This is because the bond call 
price is known from the callable bond schedule. If we use the bullet bond B , so 
that the embedded option is considered separately, we do need the critical path. 
This is because the breakup of the callable bond into the bullet and option is not 
specified in advance and must be determined. 

For more information regarding the critical path, see the Quasi-European 
approximate solution below. 


Steps in the Back-chaining Algorithm 
Step 1. Start from the last call date $t^*_L$. Call $P^{(+)}_L$ the full bond price one second after that date; it is the bullet $B_L$ with all coupons after $t^*_L$, evaluated at $t^*_L$ (recall bullet bonds contain no embedded options). We need to know the bond price $P^{(-)}_L$ one second before $t^*_L$. Denote the full bond strike price at $t^*_L$ by $E_L$ and the bullet critical path point at $t^*_L$ by $\hat{B}_L$. If $P^{(+)}_L > E_L$ or $B_L > \hat{B}_L$, then $P^{(-)}_L = E_L$ and the bond is called. However, if $P^{(+)}_L < E_L$ or $B_L < \hat{B}_L$, then $P^{(-)}_L = P^{(+)}_L$ and the bond is not called. We have $\hat{B}_L = E_L$. Alternatively, we can write $P^{(-)}_L = \min\{P^{(+)}_L, E_L\}$.

Step 2: Propagate backwards from time $t^*_L$ to time $t^*_{L-1}$ using the exact propagator. We get the bond price $P^{(+)}_{L-1}$ one second after $t^*_{L-1}$ as

$$P^{(+)}_{L-1}(r^*_{L-1}) = \int dr^*_L\; G(r^*_{L-1}, r^*_L; t^*_{L-1}, t^*_L)\; P^{(-)}_L(r^*_L) + [\text{Bullet at } t^*_{L-1} \text{ for all coupons between } t^*_{L-1} \text{ and } t^*_L] \qquad (44.20)$$


Step 3: The bond price $P^{(-)}_{L-1}$ one second before $t^*_{L-1}$ includes the possibility of calling the bond at $t^*_{L-1}$ with strike price $E_{L-1}$, where the bullet critical path point is $\hat{B}_{L-1}$. If $P^{(+)}_{L-1} > E_{L-1}$ or $B_{L-1} > \hat{B}_{L-1}$, then $P^{(-)}_{L-1} = E_{L-1}$ and the bond is called. However, if $P^{(+)}_{L-1} < E_{L-1}$ or $B_{L-1} < \hat{B}_{L-1}$, then $P^{(-)}_{L-1} = P^{(+)}_{L-1}$ and the bond is not called. For low enough rates $r^*_{L-1}$, we know the bond will be called. Thus, we start at large values of $r^*_{L-1}$ where the bond is not called. Then, stepping values down for $r^*_{L-1}$, we stop calculating $P^{(+)}_{L-1}$ or $B_{L-1}$ in step #2 the first time either inequality $P^{(+)}_{L-1} > E_{L-1}$ or $B_{L-1} > \hat{B}_{L-1}$ is satisfied.

Step 4: Continue steps #2 and #3 by recursion, replacing $L - 1$ by $l$ and $L$ by $l + 1$. The recursion is carried out back for each $l$ to $l = 1$ in order to obtain $P^{(-)}_l$ from $P^{(-)}_{l+1}$. A final propagation back to $l = 0$ then gives $P^{(+)}_0$ from $P^{(-)}_1$.
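
The four steps above can be illustrated by a minimal Python sketch. It is a toy under stated assumptions, not the production algorithm: the rate follows a driftless Gaussian process with constant volatility, discounting over each interval uses the starting rate, the bullet is a zero-coupon bond discounted at a flat rate (so the coupon term in Eqn. (44.20) is absent), and the integral over the rate at the next call date is done by simple quadrature on a grid. The names `gaussian_green` and `backchain` are hypothetical.

```python
import numpy as np

def gaussian_green(r_from, r_to, dt, sigma):
    """Toy discounted Gaussian propagator G(r_from, r_to; dt) on a grid.

    Assumptions: driftless Gaussian rate, discounting at the starting rate.
    """
    var = sigma**2 * dt
    diff = r_to[None, :] - r_from[:, None]
    density = np.exp(-diff**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    discount = np.exp(-r_from[:, None] * dt)
    return discount * density

def backchain(r, call_times, strikes, bullet_last, sigma, callable_bond=True):
    """Return P^(-) at the first call date as a function of the rate grid r."""
    dr = r[1] - r[0]
    p_plus = bullet_last                                  # Step 1: P^(+)_L = B_L
    p_minus = np.minimum(p_plus, strikes[-1]) if callable_bond else p_plus
    for l in range(len(call_times) - 2, -1, -1):          # Step 4: recursion
        dt = call_times[l + 1] - call_times[l]
        green = gaussian_green(r, r, dt, sigma)
        p_plus = (green @ p_minus) * dr                   # Step 2: propagate back
        # Step 3: the bond is called wherever P^(+) exceeds the strike
        p_minus = np.minimum(p_plus, strikes[l]) if callable_bond else p_plus
    return p_minus

# Illustrative inputs (all hypothetical)
r = np.linspace(-0.05, 0.25, 601)
call_times = np.array([1.0, 2.0, 3.0])
strikes = np.array([102.0, 101.0, 100.0])
maturity = 5.0
bullet_last = 100.0 * np.exp(-r * (maturity - call_times[-1]))

callable_price = backchain(r, call_times, strikes, bullet_last, sigma=0.01)
bullet_price = backchain(r, call_times, strikes, bullet_last, sigma=0.01,
                         callable_bond=False)
```

Because the propagator is positive and $\min\{P^{(+)}, E\} \le P^{(+)}$, the callable price in this sketch can never exceed the price of the same bond propagated without the call, which gives a quick sanity check.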


Comments for the Back-chaining Bermuda Algorithm 


Comment for Step #2. The propagator $G(r^*_{L-1}, r^*_L; t^*_{L-1}, t^*_L)$ is the Green function, including discounting. It is exact, with no grid needed for the backward-diffusion propagation between exercise dates. For Gaussian and mean-reverting Gaussian cases, the propagator can be written down analytically. The second term is there to pick up all coupons between $t^*_{L-1}$ and $t^*_L$. This term is part of the total bullet bond evaluated at $t^*_{L-1}$ (including all coupons past $t^*_{L-1}$); we call this $B_{L-1}$. In fact, $P^{(+)}_{L-1}$ equals $B_{L-1}$ less $C_{L-1}$ (the $t^*_L$ call evaluated at $t^*_{L-1}$). When $P^{(+)}_{L-1} = E_{L-1}$ at the critical rate, $B_{L-1} = \hat{B}_{L-1}$ and $\hat{B}_{L-1} = E_{L-1} + C_{L-1}$.

Comment for Step #3. Note that $P^{(-)}_{L-1} = \min\{P^{(+)}_{L-1}, E_{L-1}\}$ for the transition. This is because the call option at time $t^*_{L-1}$ is written on the bond $P^{(+)}_{L-1}$. The bondholder is short this call option, which has intrinsic value $\max\{P^{(+)}_{L-1} - E_{L-1}, 0\}$. The bond $P^{(+)}_{L-1}$ with this option subtracted is just $P^{(-)}_{L-1}$ by definition. Note that $\min\{P^{(+)}_{L-1}, E_{L-1}\} = P^{(+)}_{L-1} - \max\{P^{(+)}_{L-1} - E_{L-1}, 0\}$.


American Option Backchaining 


For an American option, the algorithm is applied at each step in the time partition 
because the bond can be called at any time. 


Backchaining for Bonds with Both Puts and Calls 
A simple modification allows the valuation of bonds with puts as well as calls. If 
the option at date $t^*_l$ is a put, the < and > symbols are interchanged for the determination of $P^{(-)}_l$, and "min" is replaced by "max". Arbitrary sequences
of puts and calls can be handled in this manner. 


Appendix 2: Some Aspects of Numerical Uncertainties


In this section, we describe some aspects of numerical uncertainties and errors 
with numerical codes. The topic is very large and a full discussion would require 
a large volume. We focus here on a few issues that arise in practice. 


Sociology and Numerical Errors 


The first issue to discuss is sociological. Different people can interpret the word 
"error" very differently. Experts in numerical analysis know that any numerical 
method carries inherent uncertainties that are called errors. On the other hand, 
non-technical people sometimes think of “errors” as human mistakes that occur 
because not enough care was taken, producing “wrong” answers. 

The sensitivity of people to the numerical errors depends on the people and 
the situation. Sometimes, microscopic noise becomes a center of attention to 
nervous people. Other times, real code errors go unmonitored. 


Quantification of Numerical Errors for Pricing 

The second issue is the quantification of numerical errors. There are textbook results that give generic guidelines. However, these are often not specific enough. Consider a price $C$ generated by a code with numerical error on the order of $\pm\Delta^{(C)}_{NumErr}$. This means that the “true” price roughly is somewhere in the interval $C_{True} \in (C - \Delta^{(C)}_{NumErr},\; C + \Delta^{(C)}_{NumErr})$. Unfortunately, in practice the magnitude of the numerical error $\Delta^{(C)}_{NumErr}$ depends on the grimy details of the problem. The real errors often can only be discovered through painful empirical investigation with a number of specific cases.

Variable time steps help in reducing numerical errors. Small time steps are often needed in the short-time region.

Increasing the number of grid nodes, especially in regions where cash flows exist or logical decisions need to be made, can also reduce numerical errors.


Oscillations in the Price as the Number of Nodes Increases 
Often an oscillatory behavior is observed for the price $C$ as a function of the number of nodes, $C(N_{nodes})$, in the calculation. For a binomial model, increasing the number of nodes means adding more time steps with smaller time interval. For the path integral discretization, besides adding more time steps we can also increase the number of nodes at a given time.

Oscillatory behavior can often be characterized by Laplace transform methods. The generic example is provided by a contour integral $\mathcal{I}(s; g, b)$ in a complex $j$ plane with

$$\mathcal{I}(s; g, b) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} dj\; \frac{s^{j}\, e^{-bj}}{j - \alpha_0 - g\, e^{-bj}} \qquad (44.21)$$

Here $g, b > 0$ and $\alpha_0$ are parameters, while $c$ is to the right of all singularities of the integrand in the complex $j$ plane. The variable $s$ is taken here as a monotonically increasing function of the number of nodes in the calculation $N_{nodes}$. We are interested in the behavior at large $N_{nodes}$, i.e. $s \to \infty$. Then the integral $\mathcal{I}$ is controlled by the leading zero of the denominator, $D(j) = j - \alpha_0 - g\, e^{-bj} = 0$ at $j = \alpha_R$. We move the contour to the left in the complex $j$ plane and pick up this leading pole. For finite values of $s$, the description is more complicated, and non-leading terms enter. In general, we get

$$\mathcal{I}(s; g, b) = \beta_R\, s^{\alpha_R} + \text{Non-Leading Terms} \qquad (44.22)$$

In order to make $\mathcal{I}$ independent of $N_{nodes}$ as $N_{nodes} \to \infty$, we choose $\alpha_R = 0$. This implies $\alpha_0 = -g < 0$. It is equivalent and convenient to expand the integrand in powers of $g$ and perform the integrations term by term. We get

$$\mathcal{I}(s; g, b) = \sum_{m=0}^{M(s)} \frac{g^m}{m!}\,[\ln(s) - (m+1)b]^m\; e^{\alpha_0 [\ln(s) - (m+1)b]} \qquad (44.23)$$

Here, $M(s)$ is the largest integer less than $-1 + \ln(s)/b$. Successive terms in this series enter as $s$ increases, producing oscillations in $\mathcal{I}$ as a function of $s$. Using the analogy, as the number of nodes increases, price oscillations occur.

The graph below gives an example. The graph has $N_{nodes} = \ln(s)/b$ with $b = 1$, $g = 0.5$, and an overall normalization of 1000. The curve first overshoots and then decreases close to the exact asymptotic value as the number of nodes increases up to 50.
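
The threshold series can be evaluated directly. The sketch below assumes the series has the form $\sum_m (g^m/m!)\,[\ln s - (m+1)b]^m\, e^{\alpha_0[\ln s - (m+1)b]}$ with $\alpha_0 = -g$, where term $m$ switches on once $\ln s > (m+1)b$. Under that assumption the large-$s$ limit is the residue at $j = 0$, namely $1/(1 + gb)$. The function name `oscillation_series` is hypothetical, and the overall normalization of 1000 used in the graph is omitted.

```python
import math

def oscillation_series(log_s, g, b):
    """Sum the threshold series as a function of ln(s).

    Assumption: alpha_0 = -g, so the leading pole sits at j = 0 and the
    sum tends to the constant 1 / (1 + g*b) for large ln(s).
    """
    alpha_0 = -g
    # M(s): largest integer strictly below ln(s)/b - 1
    m_max = math.ceil(log_s / b - 1.0) - 1
    total = 0.0
    for m in range(m_max + 1):
        x = log_s - (m + 1) * b          # distance above the m-th threshold
        total += (g * x) ** m / math.factorial(m) * math.exp(alpha_0 * x)
    return total

g, b = 0.5, 1.0
asymptote = 1.0 / (1.0 + g * b)
# N_nodes = ln(s)/b with b = 1, as in the example in the text
curve = [oscillation_series(float(n), g, b) for n in range(1, 51)]
just_above_threshold = oscillation_series(1.01, g, b)
```

With these parameters the value just above the first threshold lies above the asymptote, the value at $\ln s = 2$ lies below it, and later values settle down rapidly as successive terms enter.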


? Oscillation Example: This formalism comes from studies of the effects on high- 
energy total cross sections of successive thresholds for production of increasingly 
massive and different types of particles. 

[Figure: Approach to Asymptotic Value as Number of Nodes Increases. Plotted curve: Difference $C(N_{nodes}) - C(\text{asymptotic})$.]


In practice, oscillations take a variety of forms. The oscillations can be highly 
damped, or less damped. The results can oscillate around the exact value, or 
about a curve approaching the exact value. All these behaviors have been seen in 
practical examples. 

Of course to characterize these oscillations quantitatively we would need to 
know the Laplace transform of the price as produced by the code. Since this is 
unavailable, we actually need to rely on numerical empirical studies. 
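
As one example of such an empirical study, the sketch below prices an at-the-money European call with a standard Cox-Ross-Rubinstein binomial tree for an increasing number of time steps and compares against the Black-Scholes value. The model, the parameter values, and the function names are illustrative assumptions, chosen only because the exact answer is known in closed form.

```python
import math

def crr_call(spot, strike, r, sigma, maturity, n_steps):
    """European call on a Cox-Ross-Rubinstein tree with n_steps time steps."""
    dt = maturity / n_steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)      # risk-neutral up probability
    total = 0.0
    for j in range(n_steps + 1):              # j = number of up moves
        prob = math.comb(n_steps, j) * p**j * (1.0 - p) ** (n_steps - j)
        total += prob * max(spot * u**j * d ** (n_steps - j) - strike, 0.0)
    return math.exp(-r * maturity) * total

def bs_call(spot, strike, r, sigma, maturity):
    """Black-Scholes call, used here as the 'exact' asymptotic value."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(spot / strike) + (r + 0.5 * sigma**2) * maturity) / (
        sigma * math.sqrt(maturity))
    d2 = d1 - sigma * math.sqrt(maturity)
    return spot * cdf(d1) - strike * math.exp(-r * maturity) * cdf(d2)

exact = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
steps = list(range(5, 51))
errors = [crr_call(100.0, 100.0, 0.05, 0.2, 1.0, n) - exact for n in steps]
changes = [b - a for a, b in zip(errors, errors[1:])]
```

The list `errors` plays the role of $C(N_{nodes}) - C(\text{asymptotic})$. In this example it does not shrink monotonically; it moves up and down between successive step counts while the overall size of the error decreases.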


Anomalous Noisy Fluctuations in Option Model Output 


An apparently anomalous fluctuating noise-like behavior from day to day in option calculations can result from oscillations, even at a fixed number of nodes. The reason is that the oscillations are “non-universal”: the nature of the oscillations depends on market parameters. Hence, with a different set of market parameters, the numerical error changes, even for a fixed number of nodes. This jumpy behavior, while unpleasant, is unavoidable.


Risk Management and Numerical Errors 


The third issue for numerical errors is risk management. Risk management presents a more difficult situation than pricing. Risk is concerned with differences in prices $\delta C = C^{(1)} - C^{(0)}$ under some change of parameters $\delta x = x^{(1)} - x^{(0)}$. Numerical errors for differences are magnified relative to pricing. The relative pricing error $\Delta^{(C)}_{NumErr} / C_{True}$ can be small but the risk error can be large. Suppose $C^{(0)} = C^{(0)}_{True} - \Delta^{(C)}_{NumErr}/2$ and $C^{(1)} = C^{(1)}_{True} + \Delta^{(C)}_{NumErr}/2$. Then the code produces $\delta C = \delta C_{True} + \Delta^{(C)}_{NumErr}$. The true risk is $\delta C_{True}$. If $\Delta^{(C)}_{NumErr} = O(\delta C_{True})$ with the same sign, the code risk $\delta C$ will be off by a factor of two from the true risk, viz $\delta C = 2\,\delta C_{True}$. On the other hand, with the opposite sign for $\Delta^{(C)}_{NumErr}$, the code risk will be smaller than the true risk.
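
A small numerical illustration of this magnification, with made-up numbers: a numerical error of 0.05 on a price near 10 is a half-percent pricing error, yet it doubles the reported risk when the true price change is also 0.05.

```python
# All numbers are hypothetical, chosen only to illustrate the argument.
c_true_0, c_true_1 = 10.00, 10.05      # true prices before / after the move
num_err = 0.05                          # size of the numerical error

c_code_0 = c_true_0 - num_err / 2.0     # code price before the move
c_code_1 = c_true_1 + num_err / 2.0     # code price after the move

dc_true = c_true_1 - c_true_0           # true risk
dc_code = c_code_1 - c_code_0           # risk reported by the code

relative_pricing_error = num_err / c_true_0
risk_ratio = dc_code / dc_true
```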


The Unobserved P&L 
The unobserved P&L is a monitor. This means that the models are used to calculate the code-generated price changes $\delta C = C^{(1)} - C^{(0)}$ and this is compared with the market change $\delta C_{Market}$ if that is available. The difference $P\&L_{Unobserved} = \delta C_{Market} - \delta C$ gives a handle on the risk errors in the code.


Risk Anomalies in Interest-Rate Risk Ladders 


In Ch. 8, 11 we described interest-rate delta ladders used for risk management. 
These ladders are labeled by the changes in rates for discrete and successively 
increasing maturities. The type of rate moved (forward rates, zero-coupon rates, 
swap rates etc.) determines the type of ladder. We emphasized in Ch. 11 that 
moving the swap rates independently can lead to large fluctuations in individual 
forward rates. Because codes work with the forward rates, unusual sensitivities 
can occur if cash flows or decision logic occurs in the region where forward rates 
are moving substantially. 

Some anomalies can occur in the ladders. These can be magnified under 
unusual market conditions. For example, ladder buckets for short maturities can 
exhibit instabilities if an inverted cash curve is present”. 

In Ch. 7 we also discussed the construction of the forward rate curve needed 
as input for pricing interest-rate derivatives. Some discontinuous behaviors across 
certain transition points can result in the output curve, including the futures/swap 
boundary, swap maturity points, and the cash/futures boundary. Sometimes, the 
ladders will exhibit instabilities that can be traced to these discontinuities. 


Appendix 3: Numerical Approximation Methods 


This section describes some approximation methods. Understanding the ideas 
behind the approximations can increase intuition. 


? Inverted Curve: An inverted curve means that a longer-term rate is less than a shorter- 
term rate. Inverted curves are rare but do occur from time to time. 


A “Call-Filtering” Iterative Method

The call-filtering iteration is based on the numerical observation that the first 
exercise date after the time under consideration is often the most important one. 
Still, the options associated with the other exercise dates have non-zero value. 
The first approximation in the iteration is to assume that the value of the option 
for the purposes of valuing the critical/Atlantic path at a given time is obtained by 
considering that option as European. Successive corrections to that 
approximation can then be envisioned which take into account the potentially 
multiple-option characteristics. 

The idea works best if the option is in the money. If so, it will probably be called at the first possible call date; this is the filtering effect in action. If the option is not called at the first call date and stays in the money, it will probably be called at the next (second) call date. If the option is out of the money, its value is small, so the error from neglecting the rest of the call schedule is also small in absolute terms, though it may be comparable to the option value itself. If the option is at the money, the full complexity of the call schedule arises, but even here, the filtering effect is still operative to some extent. Iteration becomes more important in this case.


Quasi-European Approximation for Bermudas — Some Details 


The "quasi-European approximation" for a Bermuda option with a schedule (or 
its limit, the American option), is based on the filtering effect. The filtering 
approximation says either the option has small value or else it will be called soon. 


An option holder at exercise date $t^*_l$ needs, in principle, to compare the remaining compound option at all future exercise dates with the intrinsic value of the option at $t^*_l$ in order to determine the price $\hat{P}_l$ above which he/she will exercise the option. The idea of filtering is to assume first that the most probable option exercise after $t^*_l$ will happen the next time the option can be exercised, i.e. at $t^*_{l+1}$. To obtain $\hat{P}_l$ at exercise date $t^*_l$ in this approximation means to evaluate the remaining option as if it were a European option (along with a digital option) at the next exercise date $t^*_{l+1}$. This is very fast numerically.

The second iteration assumes that the determination of the critical path point at each exercise date $t^*_l$ contains the European option with exercise date $t^*_{l+1}$ as well as the double or "class-2" option with exercise date $t^*_{l+2}$. This can be computed and compared with the results of the quasi-European first iteration. If the convergence is sufficient, the iteration stops. Otherwise, "class-3" options are added with exercise date $t^*_{l+3}$ and so on. By definition, this iteration converges to the correct exact result. In the best case, the QE approximation suffices.


The Critical Atlantic Path in the Quasi-European (QE) Approximation


This algorithm gives the QE approximation to the critical Atlantic path for a 
Bermuda call embedded in a bond. Coupons can be included as indicated before. 


Step 1: Start from the last call date $t^*_L$. The critical price at that date, $\hat{P}_L$, is just the bond strike price $E_L$.

Step 2: At the next-to-last call date $t^*_{L-1}$, the bond strike price is $E_{L-1}$. Calculate the European call $C_{L-1}(P_{L-1}, t^*_{L-1})$ at $t^*_{L-1}$ with exercise date $t^*_L$ and strike $E_L$. Iterate $\hat{P}_{L-1} - E_{L-1} = C_{L-1}(\hat{P}_{L-1}, t^*_{L-1})$ to get $\hat{P}_{L-1}$ at $t^*_{L-1}$. This is exact.


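Step 3: By recursion, obtain the approximate QE critical price $\hat{P}^{(QE)}_l$ at any previous call date $t^*_l$ back to $t^*_1$. Get the European call $C^{(QE)}_l(P_l, t^*_l)$ at $t^*_l$ with exercise date $t^*_{l+1}$ and strike $\hat{P}^{(QE)}_{l+1}$. We also need the digital option $(\hat{P}^{(QE)}_{l+1} - E_{l+1})\, C^{(Unit\ Digital)}_l(P_l, t^*_l)$ with strike $\hat{P}^{(QE)}_{l+1}$, since $\hat{P}^{(QE)}_{l+1} > E_{l+1}$. We have $\hat{P}^{(QE)}_l - E_l = \int_{\hat{P}^{(QE)}_{l+1}}^{\infty} dP_{l+1}\,(P_{l+1} - E_{l+1})\, G_{l,l+1}$, with $G_{l,l+1}$ the Green function from $t^*_l$ to $t^*_{l+1}$. To get $\hat{P}^{(QE)}_l$ we therefore iterate the resulting equation:

$$\hat{P}^{(QE)}_l - E_l = C^{(QE)}_l(\hat{P}^{(QE)}_l, t^*_l) + (\hat{P}^{(QE)}_{l+1} - E_{l+1})\; C^{(Unit\ Digital)}_l(\hat{P}^{(QE)}_l, t^*_l) \qquad (44.24)$$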
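
The iteration in Step 2 is a one-dimensional root search for the critical price. The sketch below solves $\hat{P} - E_{L-1} = C(\hat{P})$ by bisection. The pricing model is an assumption made only for illustration: the call on the bond price is valued with a Black-Scholes-type formula for a lognormal price with a continuous carry yield `q` standing in for coupons. The function names and all parameter values are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def european_call(price, strike, r, q, sigma, tau):
    """Assumed model: lognormal price with carry yield q (coupon proxy)."""
    d1 = (math.log(price / strike) + (r - q + 0.5 * sigma**2) * tau) / (
        sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return (price * math.exp(-q * tau) * norm_cdf(d1)
            - strike * math.exp(-r * tau) * norm_cdf(d2))

def critical_price(strike_here, strike_next, r, q, sigma, tau,
                   upper=1000.0, tol=1e-10):
    """Bisection for the price where intrinsic value equals the next call."""
    def gap(p):
        return p - strike_here - european_call(p, strike_next, r, q, sigma, tau)
    lower = strike_here                  # gap(lower) = -call value < 0
    if gap(upper) <= 0.0:
        raise ValueError("no critical price below the upper bound")
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if gap(mid) > 0.0:
            upper = mid
        else:
            lower = mid
    return 0.5 * (lower + upper)

p_hat = critical_price(strike_here=101.0, strike_next=100.0,
                       r=0.05, q=0.07, sigma=0.10, tau=1.0)
```

Bisection is used because the gap function is increasing in the price under the assumed model, so the root is unique. Step 3 would add the unit digital term of Eqn. (44.24) to the right-hand side and repeat the same search at each earlier call date.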


Shielded “Geometrical Volatility” for Bermudas 

The shielded geometrical volatility gives an intuitive feel for Bermudas. The paths for successive possibilities of exercise of a Bermuda option are classified into sets, as described in Ch. 42. The paths corresponding to exercise at a given time $t^*_l$ in the schedule must remain below the critical path in price space for all exercises at times $t^*_{l'} < t^*_l$ before $t^*_l$, and then cross the critical path during the interval $(t^*_{l-1}, t^*_l)$ in order for this exercise at $t^*_l$ to occur. The probability $P^{(Exercise)}_l$ of this exercise is just the fraction of the number of paths crossing the critical path in $(t^*_{l-1}, t^*_l)$ (cf. Ch. 4).


For simplicity, assume that the interest-rate volatility is a constant $\sigma$. We introduce a “geometrical volatility” $\sigma^{(Geometrical)}_l < \sigma$ as corresponding to an approximate volatility that would lead to the set of paths $\{\text{Paths}_l\}^{(Exercise)}$ producing the exercise at $t^*_l$. The geometrical appellation is because the paths are constrained by the geometry of the problem. Then, $\sigma^{(Geometrical)}_l$ can be thought of as the volatility shielded and reduced by the previous exercises at $t^*_{l'} < t^*_l$. The idea is to construct a cone of transverse dimension $2\,\sigma^{(Geometrical)}_l \sqrt{\tau}$ at time $\tau$ from today, such that the desired paths $\{\text{Paths}_l\}^{(Exercise)}$ are approximately realized, at 1 std. deviation. The idea is in the picture below:


[Figure: Geometrical Volatility. Exercises at later $t^*_l$ have smaller effective volatilities due to shielding by previous exercises. Annotation: volatility cone for exercise at $t^*_l$, when some paths in this cone cross the critical path between $t^*_{l-1}$ and $t^*_l$.]


Given the geometrical volatilities, the resulting sets of paths have the appropriate constraints of previous exercises. Therefore, each exercise at $t^*_l$ can be treated approximately as a European option with volatility $\sigma^{(Geometrical)}_l$.


Mean-Reverting Gaussian Approximation to Lognormal Model 


This section describes a tractable approximation to a lognormal process in terms 
of an approximately equivalent mean-reverting Gaussian (MRG) process. 

The essential analytic problem with a lognormal model in the interest rate is to express the discount factor of products of terms $\exp[-r(t)\,dt]$ in a tractable form in terms of the logarithm of $r$. This is because then everything can be written in terms of the same variables, allowing analytic progress to be made.

Call $y(t) = \ln[r(t)\,T_0]$ where $T_0$ is a scale (e.g. 1 year). Then we want to express the discount factor in terms of $y(t)$ in a simple way. To this end, an approximate form for $r(t)$ is found by truncating the sum for $\exp[y(t)]$:

$$r(t)\,T_0 = \exp[y(t)] \approx \exp[y^{(cl)}(t)] \sum_{k=0}^{k_{max}} \frac{1}{k!}\,[y(t) - y^{(cl)}(t)]^k \qquad (44.25)$$


We choose $y^{(cl)}(t) = \ln[r^{(cl)}(t)\,T_0]$ as a convenient expansion point. Here $r^{(cl)}(t)$ is the classical interest-rate path. Up to convexity corrections, the Gaussian or mean-reverting Gaussian result is $r^{(cl)}(t) = f(t)$ with $f(t)$ the forward rate for maturity $t$. Here we are just concerned with some reasonable expansion point; the term structure constraints are implemented later.

The maximum value $k_{max}$ of the index $k$ has to be discussed. For stability in the discount factor near zero rates, $k_{max}$ must be even. We shall stop at the quadratic term, but multiply it by a factor, which approximates effects of the rest of the sum, determined empirically to give good results.

The approximation, with $r^{(cl)}(t) = f(t)$ and $\lambda$ a parameter, is

$$r(t)_{approx} = f(t)\left[1 + \ln\frac{r(t)}{f(t)} + \frac{1}{2\lambda}\ln^2\frac{r(t)}{f(t)}\right] \qquad (44.26)$$


This would be used for discounting purposes. Using a Gaussian volatility $[d_t r(t)]^2 = \sigma^2\,dt$ gives the time difference $d_t r(t)$ approximation,


Е f(t) 1 r(t) 
drugs, = drO з ' Uu B al 
d| c ug ro) c'dr[, 1 
A ЕД7 ч) р : mE 


Here $\dot{f}(t)$ is the time derivative of the forward rate.

Because the quadratic term enters with a positive sign in the sum for $r(t)_{approx}$ regardless of the value of $r(t)$, it acts as a mean-reverting term, canceling out the effects of the first term when that term becomes negative. The quadratic mean-reverting term tends to imitate the barrier at $r = 0$ for the lognormal process, keeping rates positive. However, numerically it actually provides a stronger barrier at zero rates than does the lognormal process. At zero rate, the discount factor is one, but the quadratic approximation above produces a discount factor equal to zero at zero rate. Thus, the quadratic approximation dampens out all paths near zero more than the damping provided by the lognormal model. To compensate, the parameter $\lambda$ is chosen as greater than one to soften the effects of the quadratic term.

We can test the approximation by inserting $r(t)$ and $d_t r(t)$ on the right-hand sides of Eqns. (44.26) and (44.27), and testing the outputs for consistency. The problem for the approximation occurs at small values of $r(t)$ where the logarithm is singular. As an example, $\lambda \approx 1.4$ keeps rates at around 5% reasonably consistently. If the starting forward rate is 10% with a slope of 3% for 30 years and the Gaussian interest rate volatility is $\sigma = 100\ \text{bp}/\text{yr}^{1/2}$, then the approximate $r(t)$ is 4.7% (compared to 5%). The approximate $d_t r(t)$ is [4.4% - 6.4%]. This is calculated using $\pm\sigma$ as values for $d_t r(t)$ in the right-hand side of the formula, compared to [4% - 6%] for the exact $\pm\sigma$ range.

Put together with the lognormal Green function (which is Gaussian or normal in the logarithm of the interest rate), the above approximation forms an effective mean-reverting Gaussian model, where the variable is the logarithm of the interest rate. Note that this is different from the usual mean-reverting Gaussian model where the variable is the interest rate itself.


An Exponential Interpolation Approximation 


Path integrals have exponentials that are sometimes time consuming to evaluate. Here is a robust fast algorithm to approximate interpolated exponentials that generalizes others. Start with a table of exponentials $E_i = \exp(x_i)$ at points $\{x_i\}$. To find $\exp(\zeta)$ at an interpolated point $\zeta = \lambda x_i + (1 - \lambda)\, x_{i+1}$ with $dx = x_{i+1} - x_i$, a good approximation is

$$\exp(\zeta) \approx \lambda E_i + (1 - \lambda)\, E_{i+1} - \tfrac{1}{2}\lambda (1 - \lambda)(E_{i+1} - E_i)\, dx \qquad (44.28)$$

This approximation is exact at the endpoints and agrees with the quadratic approximation to the expansions about the endpoints with small corrections. For example if $\lambda = \tfrac{1}{2}$ and $dx = 1$, the approximation is off only by 0.2% for $x_i \in (-100, 100)$, and it gets better if $dx$ is smaller.
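
A short Python check of this interpolation, assuming the formula reads $\exp(\zeta) \approx \lambda E_i + (1-\lambda)E_{i+1} - \tfrac{1}{2}\lambda(1-\lambda)(E_{i+1} - E_i)\,dx$. The function name `interp_exp` and the sample points are illustrative choices.

```python
import math

def interp_exp(x_lo, x_hi, lam):
    """Approximate exp(lam*x_lo + (1-lam)*x_hi) from the two table entries."""
    e_lo, e_hi = math.exp(x_lo), math.exp(x_hi)   # tabulated E_i, E_{i+1}
    dx = x_hi - x_lo
    linear = lam * e_lo + (1.0 - lam) * e_hi      # plain linear interpolation
    correction = 0.5 * lam * (1.0 - lam) * (e_hi - e_lo) * dx
    return linear - correction

# Midpoint of a unit interval, and of a ten-times smaller interval
rel_err_coarse = interp_exp(3.0, 4.0, 0.5) / math.exp(3.5) - 1.0
rel_err_fine = interp_exp(3.0, 3.1, 0.5) / math.exp(3.05) - 1.0
```

The relative error is a fraction of a percent at the midpoint of a unit interval, does not depend on where the interval sits, and drops sharply when the table spacing is reduced. In a real application the two exponentials would be read from a precomputed table rather than recomputed.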


Appendix 4: Some other Numerical Methods in the Literature 


This book is not a treatise on numerical methods, and no systematic search of the 
literature has been performed. However, a few innovative methods will be 
mentioned briefly. 


The Makivic Path-Integral Monte Carlo Approach


Makivic"" has written a paper regarding an efficient path-integral Monte-Carlo 
simulator for options. He uses the Metropolis algorithm. He has also parallelized 
the code using High-Performance Fortran. 


Moment Methods for Arbitrary Processes 


Jarrow and Rudd"" have set up the formalism for evaluation of options using 
moments, for arbitrary stochastic underlying processes. 


Parametric Analysis of Derivatives Pricing 


Bossaerts and Hillion"" have formulated a method for derivatives pricing in 
incomplete markets, fitting hedge ratios locally and using parameter estimation. 


45. Path Integrals and Options IV: Multiple
Factors (Tech. Index 9/10) 


In this chapter, we describe multi-factor path integrals in an arbitrary number of internal dimensions¹. This is needed to describe models and correlated risk with many factors, multi-factor yield curve models, baskets of equities or FX rates, or any other problem containing multiple variables. The figure below gives the idea.


Multi-dimensional path integral. Internal index $\alpha = 1 \ldots n_f$


We call $n_f$ the number of variables. The formalism is a straightforward generalization of the results² for two variables, say S and P. The variables at


1 History: The work in this chapter was reported in my SIAM talk in 1993 (Ref i). The two-dimensional case with $n_f = 2$ was in my 1988 path-integral paper. Path-integral calculations of multivariate yield-curve mean-reverting Gaussian models began in 1988.


2 Background: This chapter assumes you have read the preceding chapters on path integrals and interest rate options, or else you know something about path integrals.

time $t_j$ are denoted as $\{x^\alpha_j\}$ with $\alpha = 1 \ldots n_f$, or with another Greek letter $\beta, \gamma$ as required by indexing. You have to think of each vertical fiber in the figure as being an $n_f$-dimensional plane, in which the set of variables $\{x^\alpha_j\}$ is a point, all at fixed $t_j$. The case $n_f = 2$ was illustrated in Fig. 7, Ch. 42.

The $\{x^\alpha_j\}$ variables will be assumed for simplicity to be Gaussian. As explained in the pedagogical Ch. 42 on path integrals, $x^\alpha_j$ can itself be a function, so lognormal or other non-Gaussian processes, mean reversion, etc. can be included.


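We begin with the discretized matrix stochastic equation for $x^\alpha_j$:

$$d_t x^\alpha_j = x^\alpha_{j+1} - x^\alpha_j = \mu^\alpha_j\, dt + \sigma^\alpha_j \sum_{\beta=1}^{n_f} R^{\alpha\beta}_j\, \eta^\beta_j\, dt \qquad (45.1)$$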

Here $R^{\alpha\beta}_j$ is the $(\alpha\beta)$ matrix element of the square root of the correlation matrix $\rho_j$ (Ch. 28, 48; see also Ch. 22). We have included the possibility of time dependence in the correlations. Each $\eta^\beta_j$ is a Gaussian random variable.

We have the matrix equation in the internal indices for each $t_j$:

$$\rho_j = R_j R_j^T \qquad (45.2)$$

We also have the discretized Gaussian orthogonality conditions between the various $\eta^\alpha_j$, namely

$$\langle \eta^\alpha_j\, \eta^\beta_{j'} \rangle = \delta_{\alpha\beta}\, \delta_{jj'} / dt \qquad (45.3)$$
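
A one-step numerical illustration of the stochastic equation, the square-root relation $\rho = R R^T$, and the orthogonality condition with variance $1/dt$, using NumPy. The Cholesky factor is used as one valid choice of square root; the correlation matrix, drifts, volatilities and sample size are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs for three correlated Gaussian variables
rho = np.array([[1.0, 0.6, 0.2],
                [0.6, 1.0, 0.4],
                [0.2, 0.4, 1.0]])
mu = np.array([0.01, 0.0, -0.02])
sigma = np.array([0.2, 0.1, 0.3])
dt = 1.0 / 252.0
n_paths = 200_000

# One valid square root of the correlation matrix: rho = R R^T
R = np.linalg.cholesky(rho)

# Independent Gaussian variables with variance 1/dt
eta = rng.standard_normal((n_paths, 3)) / np.sqrt(dt)

# dx^alpha = mu^alpha dt + sigma^alpha * sum_beta R^{alpha beta} eta^beta dt
dx = mu * dt + sigma * (eta @ R.T) * dt

sample_corr = np.corrcoef(dx, rowvar=False)
sample_vol = dx.std(axis=0)
```

The sample correlation of the increments reproduces the input matrix to within Monte Carlo noise, and each increment has standard deviation close to $\sigma^\alpha \sqrt{dt}$, which is the usual single-time-step simulation used for VAR-type calculations.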


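The probability measure for the Gaussian variables $\eta^\alpha_j$ is derived from

$$1 = \prod_{\alpha=1}^{n_f} \prod_{j} \int_{-\infty}^{\infty} d\eta^\alpha_j\, \sqrt{\frac{dt}{2\pi}}\; \exp\left[-\tfrac{1}{2}(\eta^\alpha_j)^2\, dt\right] \qquad (45.4)$$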


In this expression, we have suppressed the time indexing for simplicity, as in the chapters treating one-variable path integral discretization. We need to fix the last $j = N - 1$ variable $\eta^\alpha_{N-1}$ by the constraint that we specify $x^\alpha_N$ in the Green function. Alternatively, we just remove the last $\int dx^\alpha_N$ integral once we introduce the constraints³. Just as in the one-dimension path integral formalism, we use the stochastic equations as constraints in order to change variables from the $\{\eta^\alpha_j\}$ variables to the $\{x^\alpha_{j+1}\}$ variables. So we write the Dirac delta function identity:

$$1 = \prod_{\alpha=1}^{n_f} \int dx^\alpha_{j+1}\; \delta\left[x^\alpha_{j+1} - x^\alpha_j - \mu^\alpha_j\, dt - \sigma^\alpha_j\, (R_j \cdot \eta_j)^\alpha\, dt\right] \qquad (45.5)$$

Here, we introduce the matrix notation $(R_j \cdot \eta_j)^\alpha = \sum_{\beta=1}^{n_f} R^{\alpha\beta}_j \eta^\beta_j$. Now we have the delta function matrix identity for any matrix $c_j$,


oleen) e ПУ ny] en 


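Here, we used $\det R_j = \sqrt{\det \rho_j}$. Also, $R_j^{-1}$ is the inverse matrix.

The path integral is then obtained exactly as in the one-dimensional case by letting the delta functions kill the integrals over the $\{\eta^\alpha_j\}$, replacing them by the $\{x^\alpha_{j+1}\}$ variables. We denote by $G(\{x^\alpha_a\}, \{x^\alpha_b\}; t_a, t_b)$ the resulting path-integral Green function from initial time $t_a$ with variables $\{x^\alpha_a\}$ to final time $t_b$ with variables $\{x^\alpha_b\}$, without discounting yet. We obtain (restoring the time indexing),

$$G(\{x^\alpha_a\}, \{x^\alpha_b\}; t_a, t_b) = \prod_{j=a}^{b-2} \prod_{\alpha=1}^{n_f} \int_{-\infty}^{\infty} dx^\alpha_{j+1}\; \prod_{j=a}^{b-1} \frac{1}{(2\pi\, dt)^{n_f/2} \sqrt{\det \rho_j}\, \prod_{\alpha} \sigma^\alpha_j}\; \exp\left\{ -\frac{1}{2\, dt} \sum_{\beta,\gamma=1}^{n_f} \frac{x^\beta_{j+1} - x^\beta_j - \mu^\beta_j\, dt}{\sigma^\beta_j}\, (\rho_j^{-1})^{\beta\gamma}\, \frac{x^\gamma_{j+1} - x^\gamma_j - \mu^\gamma_j\, dt}{\sigma^\gamma_j} \right\} \qquad (45.7)$$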


3 Last variable: This has to be done for each $\alpha$ of course. I thank Andrew Kavalov for helping to clarify this point.

We also need to introduce the discount factor, which we left off this expression because we have not specified which of the variables correspond to the variables in the discount factor.

Note that if the drifts $\{\mu^\alpha_j\}$ depend on spatial variables, the integrals become complicated. This occurs for FX where drifts depend on interest rates. See Ch. 5.


Calculating Options with Multidimensional Path Integrals 


Aside from the scary looking Greek alphabet soup, there is really no difference here from what we had in the one-dimensional case. Before calculating anything, the discount factors have to be included. Exactly as in one dimension, the integrations $\prod_{\alpha} \int dx^\alpha_b$ over the final $\{x^\alpha_b\}$ variables at final time $t_b$ have to be reinstated. This is in order to calculate discounted expected values of payoffs $C[\{x^\alpha_b\}]$ specified at time $t_b$. In general, we may calculate expected discounted cash flows by inserting the cash flows at the appropriate time points.

Analytic results can be obtained as usual, when it is possible to get them, by 
completing the square, and then performing ordinary calculus integrals. 

Boundary conditions can be introduced by restricting appropriately the integration limits over the various $\{x^\alpha_j\}$. In this way, barrier options in several dimensions can be formulated. In Ch. 19, we discussed the calculation of two-dimensional “hybrid” barrier options.

Numerical methods can be formulated as usual. Monte Carlo simulation methods evaluate the multivariable path integral just as in one dimension. Path integral discretization along the $\{x^\alpha_j\}$ fibers for each time can be done using standard numerical integration techniques. Other discretization methods can also be used.

For only one time step, once the integration of the final variables is reinstated, there is a single multidimensional integration. For example, a single time step Monte-Carlo simulation is done for VAR calculations. In Ch. 27, we described Enhanced Stressed VAR. The correlation matrix is consistently stressed. The volatilities are taken as fat-tailed vols. The final states $\{x^\alpha_b\}$ are characterized statistically and risk measures are used to assess individual risks of the various variables (the stressed CVARs) and the total risk (Stressed VAR) of all correlated variables. More ambitious calculations with several time steps can also be done, and the risk is expressed statistically.


Principal-Component Path Integrals


One application deserves special mention. We can choose the variables $\{x^\alpha_j\}$ to be principal components⁴. We can reformulate calculations using path integrals with the principal components as the variables.

Because we want to use $\alpha$ to designate a principal component, we need to use another index, say $m = 1 \ldots M$, to designate underlying physical variables $\{r^m_j\}$, of which a linear combination gives $x^\alpha_j$ at fixed $t_j$. We need some data set in order to define the principal components (or some other ansatz). Therefore, write $\tilde{r}^m_j = (r^m_j - \langle r^m \rangle)_{Data\ Set}$, and define the positive definite variance matrix $\mathcal{R}$ by the quadratic form

$$\mathcal{R}_{mm'} = \frac{1}{N} \sum_{j=1}^{N} \tilde{r}^m_j\, \tilde{r}^{m'}_j \qquad (45.8)$$

We look for the $\mathcal{R}$-eigenfunctions⁵ $\Psi^\alpha$ with components $\Psi^\alpha_m$ and eigenvalues $\lambda^\alpha$, viz

$$\sum_{m'=1}^{M} \mathcal{R}_{mm'}\, \Psi^\alpha_{m'} = \lambda^\alpha\, \Psi^\alpha_m \qquad (45.9)$$

The eigenfunctions give linear components for the principal component variables,

$$x^\alpha_j = \sum_{m=1}^{M} \Psi^\alpha_m\, r^m_j \qquad (45.10)$$

For example, the $\alpha = \text{Flex}$ eigenfunction $\Psi^{Flex} \propto (1, -2, 1)$ for rates with maturities $\{m\} = (2\text{ yr}, 5\text{ yr}, 10\text{ yr})$ produces the flex principal component $x^{Flex}_j \propto r^{2yr}_j - 2\, r^{5yr}_j + r^{10yr}_j$.


⁴ Principal Components: See Ch. 48 for a more detailed discussion and some history.
There, the variables $r^m$ are interest rates of various maturities at different times. We
might want to renormalize the eigenfunctions by dividing out the r-volatility, as is done in
the flex example below Eqn. 45.10.


⁵ Actually these “eigenfunctions” should properly be called “eigenvectors”.

We can specify stochastic equations applied to the principal components, and
then invert Eqn. (45.10) to get information for the physical variables $\{r_j^m\}$.
Alternatively, starting with the path integral using the physical variables
$\{r_j^m\}$ gives the dynamics of the principal components $\{x_j^\alpha\}$.


In any case, the eigenfunctions and eigenvalues defining the principal 
components are assumed fixed in the calculations. 
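
The construction in Eqns. (45.8) to (45.10) can be sketched numerically as follows. This is a minimal illustration, assuming the variance matrix is the plain sample covariance of the data set and that the eigenvectors are kept orthonormal (no renormalization by volatility); the function names are hypothetical.

```python
import numpy as np


def principal_components(rates):
    """Eigen-decompose the variance matrix of a data set.

    rates : array of shape (N times, M variables), the physical variables.
    Returns eigenvalues (descending), eigenvectors (columns), and the
    principal-component series obtained by projecting the rates.
    """
    rates = np.asarray(rates, dtype=float)
    centered = rates - rates.mean(axis=0)           # subtract data-set average
    R = centered.T @ centered / centered.shape[0]   # M x M variance matrix
    eigvals, eigvecs = np.linalg.eigh(R)            # R is symmetric
    order = np.argsort(eigvals)[::-1]               # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = rates @ eigvecs                    # linear combinations of rates
    return eigvals, eigvecs, components


def invert_components(components, eigvecs):
    """Recover the physical variables from the principal components."""
    return components @ eigvecs.T                   # eigvecs is orthogonal


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_rates = rng.standard_normal((250, 3)).cumsum(axis=0) * 0.01 + 0.04
    vals, vecs, pcs = principal_components(fake_rates)
    print(vals, np.allclose(invert_components(pcs, vecs), fake_rates))
```

Because the eigenvectors are held fixed, dynamics specified for the components translate back to the physical variables through the same fixed matrix.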


Multi-dimensional Monte Carlo Simulation 


As explained in Ch. 44, path integrals can be used for Monte Carlo simulation.
The discussion there was for one space dimension. This generalizes to multiple
dimensions, and bins become hyper-rectangles.

Essentially the multidimensional path integral is performed by Monte Carlo
simulation in the ordinary sense of Monte Carlo evaluation of a multiple integral
as a numerical approximation. That is, using Gaussian random numbers, Monte
Carlo choices $x_j^\alpha|_{\text{Random Numbers}}$ are made for the values of the spatial variables
$\{x_j^\alpha\}$ and the integral is calculated as an average in the usual way.
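
A minimal sketch of this procedure for several time steps is given below. It assumes free (Gaussian) propagation with a constant covariance per step, which is the simplest case; the function name `mc_path_average` and its arguments are hypothetical.

```python
import numpy as np


def mc_path_average(payoff, x0, cov_per_step, n_steps,
                    n_paths=50_000, seed=0):
    """Estimate E[payoff(x_final)] by sampling multi-dimensional paths.

    payoff       : function mapping an (n_paths, M) array to n_paths values
    x0           : starting point, length M
    cov_per_step : M x M covariance of the Gaussian move in one time step
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(np.asarray(cov_per_step, dtype=float))
    x = np.tile(np.asarray(x0, dtype=float), (n_paths, 1))
    for _ in range(n_steps):
        z = rng.standard_normal(x.shape)   # independent Gaussian random numbers
        x = x + z @ chol.T                 # correlated move for this time step
    return payoff(x).mean()                # the integral as a sample average


if __name__ == "__main__":
    cov = [[1.0, 0.5], [0.5, 2.0]]
    estimate = mc_path_average(lambda x: (x ** 2).sum(axis=1), [0.0, 0.0], cov, 4)
    print(estimate)   # compare with the exact value n_steps * trace(cov)
```

For this quadratic payoff the exact answer is the number of steps times the trace of the covariance, which gives a quick check on the sampling.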
46. The Reggeon Field Theory, Markets in Crises,
and “Predicting” Crises (Tech. Index 10/10)


Introduction to the Reggeon Field Theory (RFT) 


This chapter outlines what I believed for many years¹ would be a fruitful area of
research for mathematical finance using the theory of non-linear diffusion called
the Reggeon Field Theory (RFT) (ref). The basic idea of using the RFT is that the
RFT reduces to the familiar standard diffusion when the RFT nonlinearity is 
turned off, and so the RFT is a natural generalization of standard diffusion used 
in finance. One potential use for the RFT lies in the calculation of fat tails in a 
more fundamental fashion than a phenomenological description using fat-tail 
volatilities and ordinary diffusion. 

The RFT produces critical exponents. Similar exponents are now known in
finance as Hurst exponents (ref).

The RFT produces not only scaling exponents but also the scaling laws for
the Green function. When the nonlinearities are set to zero, the RFT reduces to
the free-diffusion $\sqrt{\text{time}}$ scaling, and produces a Gaussian for the Green
function, thus reproducing standard financial theory. The nonlinearities in the
RFT are non-trivial and modify the standard theory.

The critical exponents and scaling Green functions can be calculated ab
initio in the RFT given some starting assumptions. The starting assumptions
involve the types of nonlinearities in the nonlinear diffusion, along with technical
points regarding a physics construct called the renormalization group. The
calculations are difficult, sometimes involving a hundred pages of dense
calculations.

Recently, results were obtained for markets in crisis, and also for
probabilities for anticipating crises, by Dash and Yang (ref).


¹ History of applying the RFT to Finance: I worked on the RFT as a high-energy
theoretical physicist, calculating scaling exponents and Green functions, including
assessing relevance to experimental physics data. Much later, I used to run a quantitative
Options Seminar at Merrill. At the first meeting in 1987, I said that the free diffusion in
the Black-Scholes model was probably too simple, more general scaling laws could be
applicable, and mentioned the RFT as a promising candidate. For many years I never had
the time to do the promising calculations reported in this 2nd edition.

So this chapter is really intended to introduce the RFT and leave it as a big
exercise for some ambitious readers to do more work².


Chaos 


Chaos theory³, vigorously promoted by B. Mandelbrot, W. Brock, B. Savit, and
others, also has critical exponents. The RFT has similarities to chaos theory, but
the two theories are not identical⁴. In particular, the RFT is unique in being a
natural extension of standard finance diffusion modeling, as we will see.


Summary of the RFT in Physics 


This section contains a summary of the Reggeon Field Theory convenient for our
purposes here⁵. The RFT aims to describe high-energy diffractive scattering. In
the language of elementary-particle physics, the RFT is a theory for the Pomeron. 
The RFT without any interactions is trivial free-field theory equivalent to free 
diffusion. The interactions generate nonlinearities and non-linear diffusion. 

It is important to note that the nonlinearities of the RFT have nothing to do 
with other “nonlinear” ideas, e.g. a nonlinear dependence of volatility on the 
underlying variable or a nonlinear transformation of the underlying variable. 
Rather, nonlinearity here refers to a nonlinear dependence on the Green function 
itself in the relevant equations. In fact, different Green functions exist, which are 
coupled in a nonlinear fashion. 

The RFT was motivated by perturbation theory in the interactions. The most 
common interaction assumed is the imaginary triple-Pomeron PPP coupling, 


commonly denoted as $ir$ with $i = \sqrt{-1}$, shown below:


² Acknowledgements: I thank Xipei Yang for implementing the numerical comparison
of crises data to the RFT model. I also thank Andrew Kavalov for discussions. Kavalov
performed some RFT calculations for finance using the nonlinear quartic $\psi^{\dagger 2}\psi^{2}$
interaction at dimension $D = 1$ directly in $(x,t)$ space.


³ Acknowledgements: I thank Bob Savit for informative discussions on chaos techniques
and their possible relationship to finance.


⁴ Chaos Story: Around 1988 I started a program involving the investigation of chaos
time-series techniques. After a promising start, the management asked me if I would bet
my bonus on the success of the program. I said no. The program was canceled.


⁵ Reader Background Assumed: In this chapter, no punches are pulled since there is no
space to provide the background. The reader is assumed to be familiar with some aspects
of scattering theory, the renormalization group, field theory, second-order phase
transitions, critical exponents, dimensional regularization, the Wilson ε-expansion,
scaling behavior, irrelevant variables etc. It would also be helpful to know something
about the Reggeon Field Theory itself.

[Figure: RFT Nonlinear Diffusion: Triple Coupling Interaction]


The spatial variables in the RFT in physics correspond to the transverse 
dimensions perpendicular to the incoming beam of particles in accelerator 
experiments. The time variable in the RFT is a theoretical construct. It is
Fourier-conjugate to an "energy" variable that is really $E = 1 - j$ where $j$ is a
“cross-channel" angular momentum. The RFT is of interest in the infrared $E \approx 0$ or
$j \approx 1$ region, which governs high-energy, low momentum transfer scattering.

The RFT may be relevant for describing high-energy diffraction scattering, 
provided that the energies are well above thresholds corresponding to finite- 
energy heavy-particle production effects. 


The RFT Lagrangian 


The RFT non-linear diffusion with the PPP coupling has the nonlinear
Lagrangian $L$:

$$L = \frac{i}{2}\,\psi^{\dagger}\partial_t\psi + \text{h.c.} - \alpha_0'\,\boldsymbol{\nabla}\psi^{\dagger}\cdot\boldsymbol{\nabla}\psi - \frac{i r_0}{2}\left( \psi^{\dagger}\psi^{2} + \text{h.c.} \right) \qquad (46.1)$$

Bold type is being used for vector notation. The field $\psi(\mathbf{x}, t)$ is a theoretical
construct that depends on the spatial variable $\mathbf{x}$ and the “time” $t$, while $\psi^{\dagger}$ is
its hermitian conjugate (h.c.). The “bare” PPP coupling is denoted as $r_0$.
Correspondingly, the interaction Lagrangian is seen to involve the nonlinear
product of three fields. The “bare slope" $\alpha_0'$ is relevant to the first-order
description of the high-energy scattering away from the forward direction.


Non-interacting Free Diffusion (Brownian Motion) Limit


If the interaction is set to zero, the Lagrangian just has a free form. The
corresponding dynamical equation is the same as the free diffusion equation, but
with imaginary time. The Green function is just a Gaussian with free-field $\sqrt{\Delta t}$
scaling. Up to normalization it reads:

$$G_{\text{free}}(\Delta\mathbf{x}, \Delta t) = N \exp\left[ \frac{i\,(\Delta\mathbf{x})^2}{4\,\alpha_0'\,\Delta t} \right] \qquad (46.2)$$

The interactions change the form of the free Green function and also the 
scaling behavior. Feynman rules for diagram calculations exist that summarize 
perturbation theory and are consistent with unitarity. However the applications of 
interest here are non-perturbative. That is, the results for the Green functions of 
the RFT are of infinite order in perturbation theory. 
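
The statement that the free theory is ordinary diffusion in imaginary time can be checked in a few lines. The sketch below assumes the standard free equation of motion for the field (the interaction set to zero) and introduces a hypothetical real time variable $\tau$ through $t = -i\tau$; neither the symbol $\tau$ nor the last line's variance formula appears in the text above.

```latex
\begin{align*}
i\,\partial_t \psi &= -\alpha_0'\,\nabla^2 \psi
  && \text{(free equation, interaction set to zero)} \\
t = -i\tau \;\Rightarrow\; \partial_\tau \psi &= \alpha_0'\,\nabla^2 \psi
  && \text{(ordinary diffusion equation)} \\
\exp\!\left[\frac{i\,(\Delta\mathbf{x})^2}{4\,\alpha_0'\,\Delta t}\right]
  &\;\to\; \exp\!\left[-\frac{(\Delta\mathbf{x})^2}{4\,\alpha_0'\,\Delta\tau}\right]
  && \text{(Gaussian in } \Delta\mathbf{x}\text{)} \\
\left\langle (\Delta x)^2 \right\rangle &= 2\,\alpha_0'\,\Delta\tau
  && \text{(per spatial dimension)}
\end{align*}
```

The width therefore grows as $\sqrt{\Delta\tau}$, which is the square-root-time scaling of Brownian motion. Reading $2\alpha_0'$ as a squared volatility is an illustrative identification only.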


Calculations in the RFT 


The non-perturbative RFT calculations proceed using renormalization group
equations. The critical dimension $D_{\text{crit}}$, in which the coupling constant becomes
dimensionless, is relevant for the possible occurrence of a second-order phase
transition. Such a second-order phase transition is associated with scaling
behavior and an infinite correlation length. However, the critical dimension is not
the dimension $D_{\text{phys}}$ of physical interest. With a PPP coupling, the critical
dimension is $D_{\text{crit}} = 4$. However, the physical dimension for high-energy
scattering is $D_{\text{phys}} = 2$, corresponding to the two real transverse dimensions
perpendicular to the beam direction. We need to calculate for arbitrary dimension
$D$, i.e. at nonzero Wilson variable $\epsilon = D_{\text{crit}} - D$. At the end,
$\epsilon_{\text{phys}} = D_{\text{crit}} - D_{\text{phys}}$ is fixed.

In order to test for the existence of a phase transition, it is necessary to
calculate the Gell-Mann-Low (GML) $\beta$-function. This function is calculated in
perturbation theory⁶. If the $\beta$-function has a zero at a non-zero value of the
coupling constant, a phase transition can occur. Results are then obtained for the
critical exponents as a power series in $\epsilon$. Scaling behavior for the propagator
(the lowest-order Green function) can be calculated depending on $\epsilon$ and the
critical exponents. This is done by explicitly solving the renormalization-group
equation for the propagator. The vertex function Green function, generalizing the
simple bare interaction pictured above, must also be calculated.


⁶ Consistency: There is no contradiction between the perturbative expansion of the GML
function and the nonperturbative aspects of the final results for the Green function.

If the PPP coupling is present, higher-order couplings are irrelevant variables 
and do not change the scaling behavior (cf. Bardeen et al., ref). However, if the
PPP coupling is not present and irreducible higher-order primitive interactions 
occur, the RFT leads to different results. 


Results of the RFT in Physics 


The most complete discussion of the results assuming a PPP interaction is given
in Baig, Bartels and Dash (ref). They are summarized as follows. The imaginary
part of the amplitude, Im $T_{ab}$, for elastic scattering $ab \to ab$ at Mandelstam
variables $s$, $t$ that results from the calculation is⁷

$$\text{Im}\, T_{ab}(s,t) = A_{ab}\; s\,(\ln s)^{-\gamma}\; \Gamma^{(1,1)}\!\left[ t\,(\ln s)^{1-\zeta/\alpha'} \right] \qquad (46.3)$$

Here $\gamma$ and $\zeta/\alpha'$ are the critical exponents. They were calculated by
Abarbanel and Bronzan in one loop to $O(\epsilon)$, and extended independently by
Baker and by Bronzan and Dash in two loops to $O(\epsilon^2)$. The calculations are
straightforward though long and complicated. They do not involve arbitrary
parameters or regression fitting. The results are given by

$$-\gamma = \frac{\epsilon}{12} + \left(\frac{\epsilon}{12}\right)^{2} \left[ \frac{161}{12}\ln\frac{4}{3} + \frac{37}{24} \right] \qquad (46.4)$$

$$-\frac{\zeta}{\alpha'} = \frac{\epsilon}{24} + \left(\frac{\epsilon}{12}\right)^{2} \left[ \frac{59}{24}\ln\frac{4}{3} + \frac{79}{48} \right] \qquad (46.5)$$

In the real world, for physics, $\epsilon = 2$. The quantity $\Gamma^{(1,1)}$ is the Green-
function propagator. The form of the RFT result for $\Gamma^{(1,1)}$ is not easy to describe.
For a complete discussion including the definitions of all the symbols in Eqn.
(46.3), the interested reader is referred to Baig, Bartels and Dash (ref). Evaluation
of $\Gamma^{(1,1)}$ must be done numerically.


⁷ Notation: Mandelstam variables s, t: These variables were introduced by S.
Mandelstam and describe the kinematics of elastic scattering. For high-energy diffractive
scattering, we are interested in large s and small t. Unfortunately, by convention, the same
letter t is also used for the RFT “time”, which is really ln(s). See below for the translation
into finance variables.




If the PPP coupling is set to zero, the critical exponents are simply zero. The
propagator $\Gamma^{(1,1)}$ reduces to the non-interacting free-diffusion trivial result, both
for its functional form and for the form of the argument.


RFT Scaling Expressed in More Standard Language 
Here is the proposed translation into more recognizable language. The translation
to finance variables from Eqn. (46.3) is $\ln s \to (\Delta t)$, $t \to 1/(\Delta x)^2$. Hence,
$t\,(\ln s)^{1-\zeta/\alpha'} \to (\Delta t)^{1-\zeta/\alpha'} / (\Delta x)^2$, so the relevant scaling variable is of the form
$\sqrt{(\Delta t)^{1-\zeta/\alpha'}} \,/\, (\Delta x)$. The anomalous dimension, i.e. the difference in the
power from square-root scaling, is therefore given by $-0.5\,\zeta/\alpha'$, which is
actually a positive number. The total power and the anomalous dimension for
different dimensions are given in the table below.


Power of Δt for Different Dimensions
Given by the Reggeon Field Theory

Dimension | Total Power | Anomalous
1         | 0.64        | 0.14
2         | 0.57        | 0.07
3         | 0.53        | 0.03
4         | 0.50        | 0.00


The equivalent statement for the variance is $(\Delta x)^2 \propto (\Delta t)^{1-\zeta/\alpha'}$. For
example, at dimension $D = 1$ we get $(\Delta x)^2 \propto (\Delta t)^{1.27}$ instead of the Brownian
or Gaussian free-field result $(\Delta x)^2 \propto (\Delta t)$. Therefore, there is extra variance
coming from the interactions in the RFT.

The other critical exponent $\gamma$ gives a violation of scaling. For our purposes
here, this is less interesting than the scaling property.

We note that the RFT calculations above only make sense for $D < 4$.
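
The numbers in the table can be reproduced with a short script. This is a sketch that assumes the two-loop coefficients of Eqns. (46.4) and (46.5) as they are commonly quoted in the RFT literature (in particular the constant 79/48 in the second formula, which is the value that reproduces the quoted exponent 0.27 at D = 1); the function names are hypothetical.

```python
import math

LOG_4_3 = math.log(4.0 / 3.0)


def minus_gamma(eps):
    """Two-loop value of -gamma as a function of eps = 4 - D (assumed coefficients)."""
    return eps / 12 + (eps / 12) ** 2 * (161 / 12 * LOG_4_3 + 37 / 24)


def minus_zeta_over_alpha(eps):
    """Two-loop value of -zeta/alpha' as a function of eps = 4 - D (assumed coefficients)."""
    return eps / 24 + (eps / 12) ** 2 * (59 / 24 * LOG_4_3 + 79 / 48)


def scaling_powers(dimension, critical_dimension=4):
    """Return (total power of dt, anomalous part) for the scaling of dx."""
    eps = critical_dimension - dimension
    anomalous = 0.5 * minus_zeta_over_alpha(eps)   # deviation from square-root scaling
    return 0.5 + anomalous, anomalous


if __name__ == "__main__":
    for d in (1, 2, 3, 4):
        total, anomalous = scaling_powers(d)
        print(d, round(total, 2), round(anomalous, 2))
```

Setting the dimension to 4 gives zero anomalous part, which is the free-diffusion limit; the extra variance exponent at D = 1 is twice the anomalous part.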


What should we remember about all this? 
The two important things to retain at this point are: 


• Definite expressions are calculable in the presence of nonlinear diffusion.
• The limit as the interaction vanishes is just the free diffusion.


Directed Percolation and the RFT


Cardy and Sugar (ref) showed that directed percolation and the Reggeon Field Theory
are in the same universality class. Directed percolation is a real random walk with
branching, recombination, and absorption. This may be a fruitful observation for
further work in nonlinear finance.


Aspects of Applications of the RFT to Finance 


The starting point of the potential application of the RFT to finance is to see 
which assumptions make sense, translate the variables, etc. 

Without interactions, the RFT reduces to standard model assumptions in 
finance, corresponding to free diffusion. With interactions, changes from the 
standard models will occur. 

The main idea, of course, is that critical exponents will emerge to describe
deviations from “square-root time” scaling characteristic of the standard
Brownian assumption used in finance. That is, we expect to get functions of
$(\Delta x) / (\Delta t)^{1/2+\nu}$, where $\nu$ gives the deviation from $\sqrt{\text{time}}$ due to nonlinear
interactions.
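
One simple way to look for such a deviation in data is to regress the log of the variance of increments on the log of the time lag. The sketch below is a generic estimator, not the procedure used for the crisis results discussed later (which restricts to crisis periods and big moves); the function name is hypothetical.

```python
import numpy as np


def variance_scaling_exponent(x, lags):
    """Least-squares slope of log Var[x(t + lag) - x(t)] against log(lag).

    A slope of 1 corresponds to Brownian (square-root time) scaling of the
    standard deviation; a slope above 1 signals extra variance.
    """
    x = np.asarray(x, dtype=float)
    lags = np.asarray(lags, dtype=int)
    variances = [np.var(x[lag:] - x[:-lag]) for lag in lags]
    slope, _intercept = np.polyfit(np.log(lags), np.log(variances), 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    brownian = np.cumsum(rng.standard_normal(200_000))
    print(variance_scaling_exponent(brownian, [1, 2, 4, 8, 16, 32]))
```

For a simulated Brownian path the slope comes out close to one, so any extra variance exponent would be measured as the excess of the slope over one.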


Mapping of the RFT to Finance 

We want the mapping to give the usual finance results when the interaction is
turned off. The mapping could be as follows. The spatial variable $x$ maps to the
underlying variable under consideration. For example, for single-stock diffusion
we would just take $x = \ln S$ to correspond to lognormal diffusion in the zero
interaction case. In this case the physical dimension corresponds to a single
variable, so $D_{\text{phys}} = 1$.

The simple procedure, mirroring tradition in finance, is that the physics
results are just taken over directly. For the RFT, this uses the above mapping⁸.


Jumps and Fat Tails 


The distribution in movements of the underlying variables is critical for risk
management, as emphasized many times in this book. The nonlinear interactions
may generate interesting fat-tail jump events. If the RFT in physics is any guide,
extra variance like that shown in the table may result, producing fat-tail events.


⁸ Potential Direct Calculations using the RFT in Finance: It would be assumed that the
RFT Feynman rules apply to the calculations in finance. The RFT time (with the factor
$i = \sqrt{-1}$ removed) would correspond to real time. The analog triple PPP interaction
term would be taken as being real, not imaginary. Other interactions (quartic...) could be
envisioned. Calculations involving the renormalization group and the Wilson
ε-expansion would be carried out. I have no intention of doing this and you shouldn’t
either.


Lessons from Critical Exponents (1st Edition)


Numerical values of theoretical critical exponents are probably useful only in a 
limited sense. First, critical exponents from data vary by underlying, and also 
vary according to the data time window. Second, critical nonlinear theories are 
probably restricted to only a subset of behavior over time, so the notion of a 
critical exponent for an entire time series is only a phenomenological 
approximation. Third, the notion of the RFT dimension is only valid for $D < 4$
($\epsilon > 0$), so association of an index would need to be made as a composite object.


The RFT and Describing Financial Crises (2nd Edition)


Results confirming and extending the above conjectures were obtained by Dash
and Yang (ref) since the 1st edition. There are two types of results⁹:
1. Discovery of approximate RFT scaling averaged across various markets
of different asset classes, which are already in crisis.
2. A methodology for predicting the probability of crises in equity markets.


RFT Scaling Without Fitting for Various Markets In Crisis 


Evidence of RFT-like scaling on the average, for markets when already in crisis,
was discovered¹⁰. The scaling is, quite surprisingly, in rough numerical accord
with the RFT two-loop result for the extra part of the variance exponent,
$\Xi_{\text{RFT}} = -\zeta/\alpha' \approx 0.27$. The extra part is the deviation from the usual Brownian
exponent for variance, equal to one. The markets examined were equities, FX,
commodities, rates, and distressed bonds¹¹.


⁹ Other Critical Exponents: The Bloomberg, L.P. system has KAOS exponents, and can
calculate them for various time series of different lengths. There is also the Hurst
exponent. However the KAOS and Hurst exponents are different from the critical
exponent we use. The main differences are that for our in-crisis results, (1) we restrict the
time to crisis periods and (2) we only look at big moves within the crisis periods.


¹⁰ Definition of “Crisis” and “Scaling”: A “crisis” was defined using a formulaic recipe
invented by Adam Litke (private communication). Scaling was evaluated between one
week (to eliminate rebound effects) and one month (limited by the durations of crises).
Only “big” moves were used within crises (confidence levels 70% to 95%). Trends were
not extracted. See Dash-Yang for details (ref).


No Parameter Fitting 


Again, there was no parameter fitting for the exponent, which had been 
previously calculated and published. Do you know of another calculation done in 
advance, since Bachelier / Brownian Motion, that actually describes finance data, 
without fitting any parameters? 


Rich-Cheap Analysis, the RFT, and Crises 


The RFT can serve as a language to describe the changes of scaling behavior in 
crises from Brownian behavior. The RFT critical behavior can be used as a 
benchmark for rich/cheap analysis of markets in crisis. Those time series with 
critical exponent above the RFT extra-variance exponent 0.27 can be called 
"rich" and those with critical exponent below the RFT extra-variance exponent 
0.27 can be called “cheap”. 
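
The rich/cheap labeling just described amounts to a comparison against the benchmark. A minimal sketch follows; the function name and the "at benchmark" label for an exact tie are hypothetical additions.

```python
RFT_EXTRA_VARIANCE_EXPONENT = 0.27  # RFT benchmark quoted in the text


def rich_cheap_label(extra_variance_exponent,
                     benchmark=RFT_EXTRA_VARIANCE_EXPONENT):
    """Label an in-crisis time series relative to the RFT benchmark exponent."""
    if extra_variance_exponent > benchmark:
        return "rich"
    if extra_variance_exponent < benchmark:
        return "cheap"
    return "at benchmark"   # tie case, an assumption of this sketch


if __name__ == "__main__":
    for measured in (0.35, 0.17):
        print(measured, rich_cheap_label(measured))
```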


RFT - the Natural Extension to Brownian Motion 


Note again that the RFT, the general non-linear theory of diffusion, reduces to the 
usual finance assumption of Brownian motion and ordinary diffusion when the 
RFT interactions are zero. The RFT provides a natural language to discuss non- 
Brownian dynamics, and is the natural extension to Brownian motion.


¹¹ Numerical RFT-Like Scaling Results for Markets in Crisis: The results for the
change in the variance exponent giving the deviation from Brownian motion, averaged
across markets and across crises in the markets, depend on the variable used to measure
the scaling. The result was 0.35 using linear returns (0.35 includes the kinematic factor
of 2, going from linear returns to quadratic variance returns). The result 0.17 was
obtained using squared returns. The empirical interval (0.17, 0.35) brackets the RFT
calculated value 0.27 involving no free parameters. There is nothing in the theory or
mapping to indicate whether to choose analysis based on linear returns or quadratic
returns. There is some preference for linear returns from detailed examination of returns
in the 1987 crisis.

Non-universal variations of the exponent for individual crises of specific markets are
observed around this average RFT-like scaling behavior. Taking the averages over crises
and over markets seems to “suppress irrelevant variables” (see ref).

Altogether there were around 150 time series that were included in the analysis,
going back as far as possible in time. There were around 200 cases (including movements
in both directions for an FX rate), where each case had one or more crises. For example,
the S&P 500 had 12 crisis periods between 1929 and 2011.


Predicting Crises in Equity Markets; an Earthquake Analogy


A methodology for predicting crises in equity markets¹² was developed using
scaling language. The method worked quite well in backtesting, much better than
chance, for various equity markets¹³. This method involves a complex noise
filter. One indicator was the cumulative buildup of the anomalous variance
exponent $\nu$ during non-crisis periods. The non-crisis exponent $\nu$ is generally
smaller than the crisis-period exponent that we called $\Xi_{\text{RFT}}$.

In particular this predictive model has substantial memory effects. This is
physically reasonable given “bubble-like buildups" that occur during rather long
times before crises. The analogy is that "stress" builds up and then is released
during a crisis, rather like some types of earthquakes¹⁴.

This framework for detecting crises in advance is very different from 
standard zero-memory Brownian motion finance models. 


Details of the Crisis Prediction Methodology 


The idea is to postulate various indicators $x^{(\text{Indicator})}(t)$ as functions of time¹⁵.
Then the probability of crisis $p(t)$ is calculated, occurring within fixed time
$\Delta T$ before a given time $t$, for a given time series. Here, $\Delta T = 1$ yr was used.

The crisis probability calculation uses a logistic fit during in-sample periods,
with known crises. “S-shaped curves" are fit to step functions to determine “beta”
parameters $\boldsymbol{\beta}$ (independent of time). The step functions are crisis flags;
$y(t) = 0$ or $y(t) = 1$ if a crisis doesn’t or does occur within time $\Delta T$ before $t$.


The idea is to vary the betas to get the probabilities of crisis to match the 
crisis flags during the in-sample test period (where crises are known). Then out- 
of-sample predictions are made with the same beta parameters. 


¹² What about predicting crises in markets besides equity? We had no time to do the
analysis. But the results for equity markets are significant. Again note that the results for
markets already IN crisis were for various markets, not just equities.


¹³ Contagion: Some examination of contagion between different equity markets was
carried out.


¹⁴ Earthquake Analogy: We found out - after thinking of this analogy - that others had
made the analogy earlier.


¹⁵ Indicators and Scoring System: These are the heart of the method. One especially
useful indicator was the cumulative sum of positive exponents measured since the last
previous crisis. Indicators were reset to zero after each crisis. 40 indicator candidates
were examined; a subset was used. An empirical scoring system was constructed,
including type I and type II errors. Some smoothing schemes were employed.

We discretize time as $\{t_j\}$. Define $\mathbf{x}_j = \mathbf{x}^{(\text{Indicator})}(t_j)$ as the vector of
indicators at time $t_j$. The probability $p_j$ of a crisis within $\Delta T$ before some time
$t_j$ is assumed to be the logistic form with beta parameters $\boldsymbol{\beta}$, viz:

$$p_j(\boldsymbol{\beta}) = \exp(\mathbf{x}_j \cdot \boldsymbol{\beta}) \,/\, \left[ 1 + \exp(\mathbf{x}_j \cdot \boldsymbol{\beta}) \right] \qquad (46.6)$$

The crisis flag at $t_j$ is called $y_j = y(t_j)$. If there is a crisis at time $t_j$, then
the crisis flag has value $y_j = 1$. If there is no crisis at $t_j$, then $y_j = 0$. For in-
sample testing we know where the crises are, so we know the crisis flag values.
The likelihood function is defined as $L = \prod_j L_j$ with

$$L_j(\boldsymbol{\beta}) = p_j^{\,y_j} \left( 1 - p_j \right)^{1 - y_j} \qquad (46.7)$$

The likelihood is maximized to determine the betas.
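
The fit can be sketched in a few lines. The code below assumes plain gradient ascent on the average log-likelihood as the optimizer, and it adds a constant column to the indicators as an intercept; both are choices made for this illustration, and the function names and toy data are hypothetical.

```python
import numpy as np


def crisis_probability(indicators, betas):
    """Logistic form: p_j = exp(x_j . beta) / (1 + exp(x_j . beta))."""
    scores = np.asarray(indicators, dtype=float) @ np.asarray(betas, dtype=float)
    return 1.0 / (1.0 + np.exp(-scores))


def log_likelihood(indicators, flags, betas):
    """Log of the product over times of p_j^y_j (1 - p_j)^(1 - y_j)."""
    flags = np.asarray(flags, dtype=float)
    p = np.clip(crisis_probability(indicators, betas), 1e-12, 1.0 - 1e-12)
    return float(np.sum(flags * np.log(p) + (1.0 - flags) * np.log(1.0 - p)))


def fit_betas(indicators, flags, learning_rate=0.1, n_iter=5000):
    """Maximize the likelihood over time-independent betas (gradient ascent)."""
    X = np.asarray(indicators, dtype=float)
    y = np.asarray(flags, dtype=float)
    betas = np.zeros(X.shape[1])
    for _ in range(n_iter):
        gradient = X.T @ (y - crisis_probability(X, betas)) / len(y)
        betas += learning_rate * gradient
    return betas


if __name__ == "__main__":
    # Toy in-sample data: a constant column plus one rising indicator.
    X = np.array([[1, 0], [1, 1], [1, 2], [1, 3], [1, 4], [1, 5]], dtype=float)
    y = np.array([0, 0, 1, 0, 1, 1], dtype=float)   # crisis flags
    betas = fit_betas(X, y)
    print(betas, crisis_probability(X, betas))
```

Out-of-sample probabilities are then obtained by calling `crisis_probability` with new indicator values and the same fitted betas.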


Intuition for the procedure to determine probability of crisis 
We want a high crisis probability $p_j \approx 1$ at (actually just before) a crisis when
$y_j = 1$, and we want a small crisis probability $p_j \approx 0$ when there is no crisis,
when $y_j = 0$. So basically we want to get $p_j \approx y_j$ as far as possible.

The simplest case has only one time $t_j$. If $y_j = 1$, then $L_j = p_j$, which is
maximized by $p_j = 1$. If $y_j = 0$, then $L_j = 1 - p_j$, which is maximized by
$p_j = 0$. In either case, $p_j = y_j$ as desired. This is the intuition.


In reality there are many times; the basic idea is to get the probabilities to 
match the various crisis flags as far as possible by varying the betas. Note that the 
betas are independent of time, and a given time series can have many crises, so 
this is a non-trivial exercise. 

If the indicators are increasing functions of time, the crisis probability also
increases with time - if successful getting near one when getting near a crisis. For
positive betas, all $\beta > 0$, we see from Eq. (46.6) that if an indicator $x_j \to +\infty$
then the probability of crisis becomes high at time $t_j$, i.e. $p_j \to 1$. So we look for
indicators that get large in the vicinity of crises.


Finance Theory is Really Phenomenology


Just assuming physics equations or results (as done in this chapter) parallels the
way finance has used physics for over 100 years. As emphasized many times in
this book, finance is not physics (even though finance uses physics ideas) and
finance is not mathematics (even though finance uses mathematics). Essentially,
the idea is just to postulate a mapping, calculate the consequences, fit parameters
of the model to financial data, and see how the results work in practice¹⁶.

Finance is actually just phenomenology, in the language of physics.


47. The Macro-Micro Model and Trend Risk:
Overview (Tech. Index 4/10)


Explicit Time Scales Separating Dynamical Regions 


The Macro-Micro model was developed in a systematic program in the late 
1980’s. The first goal was to model phenomenologically the real-world statistical 
behavior of yield curves using multifactor models. The idea of different dynamics 
in different time regimes is implied by real-world data. The second goal was to 
be able to provide a framework to price contingent claims, including the different 
time regime dynamics. The model presented here is a real-world model that 
approximately satisfies no-arbitrage constraints. It is useful for risk management,
which requires (or should require) information on different time scales.

The Macro-Micro model explicitly includes time scales. Standard model 
assumptions of the movements of financial variables do not include time scales 
that separate different dynamical regions. On the other hand, the behavior of 
these variables in the real world clearly exhibits a variety of time scales. Interest 
rates, FX rates, stock prices etc. behave very differently at long time scales (e.g. 
months) relative to their behavior at short time scales (e.g. days). However, the 
absence of time scales in standard models makes the description of these different 
behaviors at different time scales difficult or perhaps impossible to understand. 

It is true that mean reversion parameters with units 1/time can be used. We
make extensive use of mean reversion, in fact very strong mean reversion. 
However, the yield-curve data coupled with the pricing of derivatives require 
more than mean reversion. 

The Macro-Micro model is quite intuitive. The Macro component with long 
time scales is associated with macroeconomic behavior, providing a connection 
to economics. The Micro component with short time scales is connected with 
trading. Although first formulated for interest rates, it now appears that the idea is 
more general and may apply to the FX and equity markets as well. 

There are three parts to this chapter. The first part summarizes the original 
Macro-Micro multifactor yield-curve investigation in Ch. 48 - 50. The second 
part deals with further Macro-Micro developments in Ch. 51. The third part in 
Ch. 52 deals with a related topic that is called a “function toolkit”.


I. The Macro-Micro Yield-Curve Model: Ch. 48-50


The Macro-Micro model originated from a program at Merrill Lynch in the late
1980's to describe yield-curve movements using a multifactor model¹. The most
important point to retain is that the data for yield-curve movements are consistent
with long-term Macro quasi-random “quasi-equilibrium” smooth behavior,
around which rapid but generally small, strongly mean-reverting Micro fluctuations
take place²,³. This is very different from the behavior produced by standard
finance models.

The quasi-random quasi-equilibrium Macro behavior has an associated macro 
cutoff time scale below which changes in the Macro component do not happen. 
This is very different from any Brownian model (with or without mean 
reversion). Thus, a sort of spectral time decomposition is implied with different 
properties at short and long time scales, governed by different dynamics. 

The different dynamics of the Macro and Micro components of the 
underlying variable movements has a natural and attractive physical 
interpretation. The Macro component can be associated with a moving quasi- 
equilibrium governed by the response of the markets to long-term 
macroeconomic considerations. The Micro component can be regarded as the 
result of market fluctuations following the macro trends. The Micro component 
also contains occasional fat-tail jumps. In this way, the Macro-Micro model fits 
in both with macroeconomics and with trading activity. 


Use of SSA to determine the Macro Component in the Macro-Micro Model 


Recent work introducing Singular Spectrum Analysis (SSA), described in Ch. 36, 
allows a new approach to determining the Macro component of the MM model. 
The idea is that SSA is capable of determining trend behavior (here the Macro 
component) and eliminating noise (here the Micro component). 
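
For orientation, here is a minimal sketch of basic SSA used as a trend extractor. It follows the generic textbook recipe (trajectory matrix, singular value decomposition, reconstruction from the leading components, diagonal averaging) and is not necessarily the variant described in Ch. 36; the function name, window length, and number of components are hypothetical choices.

```python
import numpy as np


def ssa_trend(series, window, n_components=1):
    """Reconstruct a smooth component from the leading SSA components."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column c holds x[c : c + window].
    trajectory = np.column_stack([x[c:c + window] for c in range(k)])
    u, s, vt = np.linalg.svd(trajectory, full_matrices=False)
    approx = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    # Diagonal averaging: average all matrix entries that map to the same time.
    total = np.zeros(n)
    counts = np.zeros(n)
    for c in range(k):
        total[c:c + window] += approx[:, c]
        counts[c:c + window] += 1.0
    return total / counts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(200)
    macro = 1.0 + 0.05 * t                         # slow "Macro" path
    micro = 0.5 * rng.standard_normal(t.size)      # fast "Micro" noise
    trend = ssa_trend(macro + micro, window=40, n_components=2)
    print(np.abs(trend - macro).mean(), np.abs(micro).mean())
```

In this toy example the reconstructed trend plays the role of the Macro component and the residual plays the role of the Micro component.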


' Acknowledgement: I thank Les Seigel, former Senior Adviser and Manager, Financial 
Technical Assistance in the Treasury Group at the World Bank, for very lively and 
insightful conversations on this and many other topics. 


² “Quasi-Random Quasi-Equilibrium Yield Curve Path”, “Historical Quasi-Equilibrium
Yield Curve Path”: The nomenclature is meant to evoke several ideas.
“Equilibrium” signifies that mean-reverting fluctuations exist around an equilibrium path,
stable on time scales long with respect to these fluctuation times. “Quasi-Equilibrium”
signifies that the equilibrium path changes slowly with time. “Quasi-random” means that
such future slow changes are drawn from a random distribution that does not scale down
to small times. Historically, it is assumed that one particular realization of this
quasi-random behavior took place to form the “Historical quasi-equilibrium yield curve path”.


³ Fat Tails: Occasional fat-tail jumps also occur.
Outline
The model is summarized in the next three chapters, presented in historical order. 


Ch. 48. The Multifactor Lognormal Model and Yield-Curve Kinks 


The first chapterⁱ considers the multivariate generalization of the single-variate
lognormal (LN) interest-rate model with eleven factors, using input historical 
data for volatilities and correlations. This is a natural model to consider. 
Historically, it was one of the first multifactor yield-curve models. Today, similar 
models are still used in some risk management contexts. 

A variety of statistical techniques was used to analyze the data and the model, 
including one of the first uses of principal components on the Street. 

The multifactor LN model produces simulated yield curves containing 
unphysical yield-curve shape kinks. We believe that this is a general property of 
models without strong mean reversion. 

To drive the point home, here is a picture: 


[Figure: Multifactor LN (and probably other models) leads to an undesirable yield-curve kink]


Yield-curve kinks present a potential major modeling risk issue. Usually the 
problem is ignored. The avoidance of undesirable yield-curve kinks constitutes 
an important challenge to any putative multifactor yield-curve model. 


Ch. 49. The Strong Mean-Reverting Yield-Curve Multifactor Model 


The second chapter in this series involves the construction of a strong
mean-reverting multifactor model that replaces the unsuccessful LN model". This 
strong mean-reverting multifactor model does successfully describe yield-curve 
statistical data, and without the presence of undesirable yield-curve kinks. 

A statistical tool called Cluster Decomposition Analysis (CDA) is described 
that was used to discover the need for strong mean reversion. CDA uses third- 
order correlation Green functions. Using this sharp probe, the yield-curve data 
are observed to imply the existence of very strong mean reversion and small, 
rapid fluctuations about an historically determined slowly varying quasi- 
equilibrium yield curve. 

Here is a picture outlining the process determining the multifactor strong
mean-reverting Micro Model⁴.


⁴ Statistical Methods: These methods were also used for probing the multifactor LN
model. In the figure, Principal Components (ef, ev) means that both eigenfunctions and
eigenvalues were used to compare model output with the data.
Determination of Strong Mean-Reverting Micro
Multifactor Yield Curve Model


Micro Input: Historical Vols, Correlations
Macro Input: Quasi-equilibrium yield curve


Statistical Methods
Cluster Decomposition Analysis
Third Order Green Functions
Principal Components (ef, ev)
Standard Statistical Probes


Output Quantities
Micro Model
Strong Mean Reversion
Curve Shape Volatilities
Avoid Yield-Curve Kinks
Output vols, correlations


Ch. 50. The Macro-Micro Yield-Curve Model 

The third chapter contains the construction of the Macro-Micro yield-curve 
model for the generation of future yield curves needed to price options and 
perform risk analysis". There are two generic time scales, in which the dynamics
are very different.

Here is a diagram of the construction of the Macro-Micro model:
Macro-Micro Multifactor Yield Curve Model for
Risk Management 


Macro Input 
Quasi-random quasi-equilibrium yield curves 
Implied Macro volatilities, correlations 
Macro interpretation via macroeconomics 
Long Time Scales 


Micro Input 
Strong Mean Reverting Multifactor YC Model 
Occasional Jumps 
Micro interpretation via Trading 
Short Time Scales 


No-Arbitrage Constraints 
Yield-Curve term-structure constraints 
Forward stock price, FX for other markets 


Output Quantities
Realistic Yield-Curve Shapes, Statistics 
Paths do spread out in time 
Option Prices 


For long Macro times, Macro quasi-stochastic variables produce quasi- 
random means, thus allowing future yield-curve paths to spread or fan out in a 
smooth fashion over long times. 

For short Micro times, the multifactor strong mean reversion model is used. 
This maintains consistency with the successful description of historical yield- 
curve data without kinks. The fluctuations from the data and the micro model that 
fits the data imply interest rate paths that do not fan out much due to the strong 
mean reversion. 

A third component is due to the occasional fat-tail jumps. This third 
component is added on to the Micro component separately. It is possible that 
these fat-tail jumps are associated with nonlinear diffusion or chaos, as described
in Ch. 46. It should be emphasized that these effects, while important, are
additional in this framework.

In a sense, the Macro component acts as a WKB approximation" to the full 
set of fluctuations. The Micro fluctuations are the small fluctuations around the 
Macro WKB approximation. 
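
To make the two time scales concrete, here is a minimal sketch for a single rate. It is not the model of Ch. 50. It assumes (my assumptions, for illustration only) that the Macro path is piecewise linear with a slope redrawn once per cutoff interval, and that the Micro component is a discretized strongly mean-reverting process around it. Fat-tail jumps are left out. The function name and all parameter values are hypothetical.

```python
import math
import random

def simulate_macro_micro(r0, n_steps, dt, macro_cutoff_steps,
                         macro_slope_sd, micro_kappa, micro_sigma, seed=0):
    """Return (macro_path, rate_path) for one rate and one scenario."""
    rng = random.Random(seed)
    macro, rate = [r0], [r0]
    slope, x = 0.0, 0.0   # current Macro slope and current Micro deviation
    for step in range(n_steps):
        # Macro: the slope is redrawn only once per cutoff interval, so the
        # Macro component does not change on shorter time scales.
        if step % macro_cutoff_steps == 0:
            slope = rng.gauss(0.0, macro_slope_sd)
        macro.append(macro[-1] + slope * dt)
        # Micro: strongly mean-reverting fluctuation about the Macro path.
        x += (-micro_kappa * x * dt
              + micro_sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        rate.append(macro[-1] + x)
    return macro, rate

dt = 1.0 / 52.0   # weekly steps
macro, rate = simulate_macro_micro(
    r0=0.08, n_steps=520, dt=dt, macro_cutoff_steps=52,
    macro_slope_sd=0.01, micro_kappa=26.0, micro_sigma=0.004, seed=1)

# Largest distance of the simulated rate from its Macro path.
max_dev = max(abs(r - m) for r, m in zip(rate, macro))
```

Running several seeds shows the intended picture: the Macro paths fan out slowly over the years, while each rate path stays within a narrow band around its own Macro path.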


Interpretation of the Macro Component 


We propose an interpretation of the Macro slowly varying quasi-equilibrium 
yield curves as due to correspondingly slowly varying macroeconomic trends 
(Fed. policy, inflation...). 


Interpretation of the Micro Component 


The rapid Micro fluctuations are proposed to be due to trading activities following the
smooth Macro trends and reacting to market events. Occasional fat-tail jumps are 
due to exceptional market movements. However, the dominant Micro dynamics 
are strongly mean reverting. 


II. Further Developments for the Macro-Micro Model: Ch. 51 


Some further developments for the Macro-Micro Model are in Ch. 51, including: 


• Using SSA to determine the Macro Component
• Intuition: Short to Long Times - Volatility, No-Arbitrage


The Macro-Micro Model Applied to the FX and Equity Markets 


We summarize some preliminary analyses that indicate the relevance of the 
Macro-Micro idea to FX and equities. These include: 


• Strong Mean Reversion and Cluster Decomposition Analysis
• Probability Analyses for FX and the Macro-Micro Model


Formal Developments for the Macro-Micro Model 


We present some formal developments for the Macro-Micro model. These
include a discussion of hedging, consistency with forward quantities, term-structure
constraints, and no-arbitrage. A general class of parameters {λ} is
introduced to parameterize the Macro dynamics. Included:


• The Green Function with Specific Quasi-Random Drift
• Averaging the Green Function over the Macro Parameters
• Option Pricing with the Macro Parameter - Averaged Green Function
• The Macro Parameter - Averaged Diffusion Equation


No Arbitrage and the Macro-Micro Model 


We discuss various aspects of no arbitrage for the Macro-Micro model. Although 
the MM model is a real-world model, it nonetheless enjoys some no-arbitrage 
properties. As mentioned in Ch. 50, the basic idea is to have a sufficient number 
of parameters that can be fixed from no-arbitrage considerations. Included are the 
following topics: 


• The Macro-Micro Model, Hedging, and No Arbitrage for Equity Options
• The Macro-Micro Model and Satisfaction of the Term-Structure Constraints
for Interest-Rate Dynamics


Other Topics for the Macro-Micro Model 


Some Remarks on Chaos and the Macro-Micro Model 
Technical Analysis and the Macro-Micro Model 

The Macro-Micro Model and Interest-Rate Data 1950-1996 
Data, Models, and Rate Distribution Histograms 

Negative Forwards in Multivariate Zero-Rate Simulations 


Finance Models Related to the Macro-Micro Model 


• Derman's Equity Regimes and the Macro-Micro Model
• Seigel's Nonequilibrium Dynamics and the Macro-Micro Model


Macroeconomics, Economics Literature, and the Macro-Micro Model 
The Fed and the Macro Interest Rate Component 

TIPS (Bertonazzi and Maloney) 

Currency Crises (Kaminsky/Reinhart, Omarova)

Time Scales in FX (Feiger and Jacquillat, Blomberg, Chin) 


III. A Function Toolkit: Ch. 52 


In this last chapter (Ch. 52), a toolkit of functions is presented that is potentially
useful at both long and short time scales. The functions were originally used in
describing some phenomena in high-energy physics and also in engineering.


Possible Additional Macro Component for Cycles


In this chapter, the function toolkit is introduced and suggested for analyzing 
business cycles operating over long time scales. Specific topics are: 


• Time Thresholds; Time and Frequency; Oscillations
• Relation of the Function Toolkit to Other Approaches
• The Full Macro: Quasi-Random Trends + Toolkit Cycles


The Micro Component (Trading) 


In this section, we suggest that the function toolkit might find some applications 
in trading. 


Technical Analysis and the Macro-Micro Model 


We briefly describe trading technical analysis, and propose a qualitative 
connection with the Macro-Micro model. 
48. A Multivariate Yield-Curve Lognormal Model
and Yield-Curve Kinks (Tech. Index 6/10)


Summary of this Chapter 


This chapter is the first of four in this book dealing with the Macro-Micro Model. 
The motivating idea was to construct a multivariate yield-curve model that can 
successfully describe the statistics and dynamics of the yield curve as it moves in 
time. To this end, putative models were compared to yield-curve data using a 
battery of statistical tools. This chapter¹ contains the first attempt, a multivariate
lognormal (LN) interest rate model. This model is the natural generalization of 
the popular single-factor LN model. 

While not the final product, the examination of the multifactor LN model is 
useful if only because it highlights the problems in describing yield-curve 
statistics. The next chapter, Ch. 49, presents a successful description of yield- 
curve dynamics in terms of strong mean reversion about a slowly varying quasi- 
equilibrium yield curve. Occasional fat-tail jumps are also present. This model 
then forms the Micro component of the Macro-Micro model, described in Ch. 50. 

Because the goal is to fit historical yield-curve statistical data, no-arbitrage 
drifts are irrelevant at this stage. Instead, an historically based quasi-equilibrium 
yield curve is introduced around which fluctuations occur. In the Macro-Micro 
Model of Ch. 50, a set of quasi-equilibrium curves is generated by quasi-random 
variables to form the theoretical Macro component. The determination of an 
overall average drift through no-arbitrage considerations becomes relevant for 
pricing contingent claims. The no-arbitrage properties of the Macro-Micro model 
in various markets and recent developments of the model are examined in Ch. 51. 


¹ History and Acknowledgements: This chapter is mostly taken from Ref. i. The work
was done in 1988-89 in collaboration with Alan Beilis. I thank Alan for his collegial and 
dedicated work over many years with me. Alan had the seminal idea of the multifactor 
lognormal model in early 1988. This was among the first multifactor models that directly 
modeled the yield curve. It may have been the first yield-curve model to have a separate 
factor for each maturity. 
The Problem of Kinks in Yield Curves for Models


Some yield-curve data properties are extremely difficult for models to reproduce. 
The most difficult is to include the large magnitudes of non-parallel movements 
without anomalous large local inversions (kinks), as described by the statistical 
properties of spreads of neighboring maturities. 

Important anomalies of the multivariate lognormal model with respect to the 
data are discovered. Namely, the multivariate lognormal model yield curves are 
not smooth enough and do contain kinks, even when data are used for volatilities 
and correlations. 

We believe that avoiding undesirable kinks in yield curves presents a major 
challenge for all putative yield curve models². Usually the problem is overlooked.
Note that one or two factor models cannot generate kinks, since three points on 
the yield curve are required to make a kink. The idea is illustrated below. 


Local Inversions or Kinks in the Yield Curve 


I. Introduction to this Chapter 


The single-factor lognormal model is often used for interest-rate-risk analysis and 
contingent-claims. The single factor is taken as the short interest rate. This model 
can be constrained to be arbitrage-free using initial yield-curve constraints”. 

The point of view in this chapter is that if the short rate is assumed 
lognormal, then logically speaking, rates corresponding to other maturities could 
be assumed lognormal as well. This is a generalization of a two-factor lognormal 
model sometimes used in pricing mortgage-backed securities". 

Ideally for risk management purposes, it is desirable to have a model that 
provides reasonable yield curves moving forward in time that possess statistical 


² Kinky Yield-Curve Challenge: It is a challenge for any multifactor model purporting
to describe yield-curve dynamics to avoid kinks in forward-time yield curves. I have met
other quants who told me that their multifactor models also contain yield-curve kinks. I
believe that these anomalies are generic to all models without very strong mean reversion.
A strong mean-reverting model without yield-curve kinks is described in the next chapter.
properties in agreement with actual data. For example, this would ensure that the
correct spread statistics between the 10-year rate driving a prepayment model and
the short-term rate used for discounting cash flows in a mortgage-backed-securities
framework would be incorporated.

This chapter presents the first part of the results of a research program carried 
out at Merrill Lynch around 1988-89. The work was dedicated to the construction 
of a multivariate yield-curve simulator with two properties: (1) The simulator 
agrees with the statistical properties of yield-curve data, and (2) The simulator 
can be used for interest-rate-risk analysis and contingent claims valuation. The 
lognormal multivariate model is the first step along this direction. The next two 
chapters present steps toward a more successful model. 

The reasons we discuss this rather ancient model are:

1. It is a prototype for multifactor real-world model candidates.

2. Other more advanced models will have similar characteristics.

3. The extensive data vs. model analyses here contain lessons for the future.


Statistical Tests: Brief Description 


The first statistical probe tests for the presence of local yield-curve inversions, or 
kinks (we distinguish between smooth yield-curve inversions and kinks). These 
kinks or local inversions are relatively rare and of small magnitude in actual data, 
and so the test is a sharp discriminator for models. The rest of the statistical 
probes are the means and volatilities of neighboring-maturity yield-shifts and 
finally the usual interest rate volatilities for different maturities, as well as the 
interest rate correlation matrix between different maturities. 

The second part of the analysis, EOF Analysis or Principal Component 
Analysis, breaks down the yield curve into its component movements³. Roughly,
but not exactly, these movements correspond to parallel shift, tilt, flex, and 
behaviors that are more complex’. 


Data Used in the Analysis 

We applied these statistical probes to characterize weekly U.S. Treasury yield 
data over various time windows, namely 2, 5, and 10 years⁴. We focus here on
the data window of 5 years. 


³ Standardization of Principal Components: Today, many years after the writing of the
original paper (ref. i), principal components are standard. When this work was done, the
technique was almost unknown in finance. For this reason a description of the framework
was included in the paper, which is retained in the book.


⁴ Acknowledgements - Data: The yield-curve data were current as of the writing of the
original paper (ref. i) in 1989. The data are US Treasury yields, including coupons. These
data have been presented in numerous talks, and are in the CPT-CNRS preprints (refs. i
and v). I thank Merrill Lynch for the use of the statistics and graphs from these data.
Note that these days it would be more relevant to use swap rates. Because of
the relation between a swap rate and a bond (cf. Ch. 8), the analysis presented 
here is in fact relevant. 

The yield curve was taken as consisting of eleven maturities between 3 
months and 30 years. The notation is shown in the figure below: 


Multifactor Yield Curve Notation 


Maturities: 3M, 6M, 1Y, 2Y, 3Y, 4Y, 5Y, 7Y, 10Y, 20Y, 30Y

Maturity indices: 1, 2, 3, ..., 11


The same statistical probes were used to analyze the output yield curves 
generated by the multifactor lognormal model. 

The historical volatilities and correlation matrix, averaged over suitable time 
windows, were used as input to the model. 


The Historical Quasi-Equilibrium Yield-Curve Path 


Since we were interested in the statistical properties of the yield curves, a simple 
model for the historical trends of the yield curve was used as input. This gives the 
model the best chance at describing the yield-curve fluctuation data. The trend 
model just consists of several long-time straight-line trend segments. We call this 
the historical “quasi-equilibrium yield-curve path”. 


Summary of the Results 


The most striking way of presenting the results is simple visual examination⁵.
Fig. 1A exhibits a three-dimensional plot of weekly yield-curve data over the 
period 1983-1988. Fig. 1B shows one typical yield-curve path (of all maturities) 
over time, from the Monte-Carlo simulation of the multivariate LN model. The 
path starts from the initial data yield curve. The reader can see that the lognormal 
model yield-curve path exhibits a variety of unphysical features, including 
unacceptable local yield curve inversions or kinks⁶. Again, we believe that this
problem constitutes an important challenge for any realistic multifactor yield- 
curve simulator. 


⁵ Figures: All numbered figures are at the end of the chapter, and are taken from Ref. i.


⁶ See the Next Chapter 49: These yield-curve kink problems are largely eliminated with
the introduction of strong mean reversion.
The Rest of the Chapter


The remainder of the chapter is organized as follows. In Sections IIA and IIB the
results from the statistical tests are discussed, including the problem of yield- 
curve kinks. Section III presents the analysis of the data and model yield-curve 
movements using (EOF) principal components. Section IV describes the 
reduction of the eleven-variate lognormal model into a simpler model with three 
variates. Section V summarizes the chapter and briefly describes the next two 
chapters in the series‘. The appendices contain details and equations 
complementing the discussion of the text. Appendix A defines various quantities 
and contains a description of stochastic dynamics for the multivariate lognormal 
model. Appendix B reviews the methodology for the decomposition of the yield- 
curve movements into EOF or principal components. 


IIA. Statistical Probes, Data, Quasi-Equilibrium Drift 


In this section, we describe the statistical probes used to test the multivariate 
lognormal model with data. These include mean shifts and fluctuations between 
rates of adjacent maturities along the yield curve, means, volatilities, and 
correlations. A very important measure of the yield-curve shape is the amount of 
local inversions, or "kinks", in which the yield of a longer maturity is lower than
the yield of a shorter maturity but where the yield curve itself is generally not 
inverted. Large kinks as opposed to smooth inversions are unphysical, usually 
representing uncharacteristic shapes not seen in the U.S. Treasury data. Avoiding 
kinks turns out to constitute a sharp test that is very difficult for models to pass. 
Even if the average time-series statistics (e.g. volatilities, inter-maturity 
correlations) are correct, the way the yield curves actually look can be wrong. 

All equations are given in the appendices. Statistical quantities are defined for 
the data, as usual, over a given time window. For the simulator, quantities are 
calculated for each yield-curve path and then averaged over all paths. We 
distinguish between the "short end" of the yield curve, which we denote by 3 
months to 1 year and the "long end", 2 years to 30 years. 
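
As a rough illustration of the adjacent-maturity probes just described, the sketch below computes the time-averaged shift between neighboring maturities and the standard deviation of that shift for a small table of yields. The function name `adjacent_shift_stats` and the toy data are hypothetical. The actual definitions are in the appendices and may differ in normalization and units.

```python
from statistics import mean, pstdev

def adjacent_shift_stats(yield_curves):
    """yield_curves: list of curves (one per date), each a list of yields in
    percent ordered by maturity. Returns, for each adjacent-maturity pair,
    (time-averaged shift, standard deviation of the shift) in basis points."""
    n_mat = len(yield_curves[0])
    stats = []
    for j in range(n_mat - 1):
        # Shift between neighboring maturities on each date, in bp.
        shifts = [100.0 * (curve[j + 1] - curve[j]) for curve in yield_curves]
        stats.append((mean(shifts), pstdev(shifts)))
    return stats

# Toy data (hypothetical): three weekly curves for maturities 3M, 6M, 1Y.
curves = [
    [8.00, 8.30, 8.60],
    [8.10, 8.50, 8.70],
    [8.20, 8.40, 8.80],
]
shift_stats = adjacent_shift_stats(curves)
```

For a simulator, the same function would be applied to each yield-curve path and the results averaged over paths, as described in the text.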


Yield-Curve Data 


Fig. 1A shows weekly Treasury yield-curve data from 5 years (1983-88). Fig. 2A
shows the time-averaged statistical properties. These data are the mean shifts
$\mu_{\text{shift}}(T;\Delta T)$ between rates of adjacent maturities, the standard deviations
$\sigma_{\text{shift}}(T;\Delta T)$ of the shifts between rates of adjacent maturities, rate volatilities
$\sigma(T)$, correlations $\rho(3M,T)$ of the 3-month rate with rates of other
maturities, and correlations $\rho(10\text{yr},T)$ of the 10-year rate with rates of other
maturities. Again, the maturities are labeled by the maturity index.
The results for these data show that the mean shift between the 3 and 6-month
rates is about 35 bp/yr while the standard deviation of this shift is around 20
bp/yr. Other results can be read off the graphs. One notable feature is that the
30-year rate was consistently below the 20-year rate.


Input Volatilities and Yield-Curve Correlations 
The short-end rate volatilities are around $0.15/\sqrt{\text{yr}}$ while the long-end rate
volatilities are around $0.13/\sqrt{\text{yr}}$. We have chosen the convention of calculating
volatilities for yield movements normalized by the initial yield at the beginning
of the time window. The units are appropriate for lognormal models.

The historical correlation matrix between yield movements of different 
maturities breaks up into several distinct parts. There is a clear division between 
the short and long ends. The short end (by itself) is highly correlated: short rate -
short rate correlation coefficients are $\rho > 0.75$. The long end (by itself) is also
highly correlated, with long rate - long rate correlation coefficients $\rho > 0.9$.
However, the short end and the long end are only loosely correlated, with short
rate - long rate correlation coefficients around $\rho = 0.4$.
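
To show how volatilities and a block-structured correlation matrix of this kind can drive an eleven-variate lognormal simulation, here is a minimal sketch. It is not the simulator of Appendix A. The specific correlation values (0.80, 0.92, 0.40) are illustrative choices inside the ranges quoted above, the initial curve is made up, and the step is driftless, whereas the text uses a quasi-equilibrium drift.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T = a, for symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

# Assumed block structure: 3 short-end maturities, 8 long-end maturities.
N_SHORT, N_LONG = 3, 8

def rho(i, j):
    if i == j:
        return 1.0
    if i < N_SHORT and j < N_SHORT:
        return 0.80          # short end with short end
    if i >= N_SHORT and j >= N_SHORT:
        return 0.92          # long end with long end
    return 0.40              # short end with long end

n = N_SHORT + N_LONG
corr = [[rho(i, j) for j in range(n)] for i in range(n)]
vols = [0.15] * N_SHORT + [0.13] * N_LONG   # per sqrt(year)
low = cholesky(corr)

def lognormal_step(rates, dt, rng):
    """One driftless lognormal step for all maturities, correlated shocks."""
    z = [rng.gauss(0.0, 1.0) for _ in rates]
    out = []
    for i, r in enumerate(rates):
        shock = sum(low[i][k] * z[k] for k in range(i + 1))
        out.append(r * math.exp(-0.5 * vols[i] ** 2 * dt
                                + vols[i] * math.sqrt(dt) * shock))
    return out

rng = random.Random(0)
curve = [7.0 + 0.2 * i for i in range(n)]   # hypothetical rising curve, percent
for _ in range(52):                         # one year of weekly steps
    curve = lognormal_step(curve, 1.0 / 52.0, rng)
```

Each maturity has its own factor, so nothing in this construction forces neighboring maturities to stay ordered as the simulation proceeds.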


The Input Historical "Quasi-Equilibrium" Yield-Curve Path for the Model 


A critical input to the Monte-Carlo simulators is the drift for each yield, defined
by the corresponding mean in the stochastic equation (cf. App. A). The drift
defines the Historical Quasi-Equilibrium Yield Curve Path, denoted as
$r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T)$. While the definition of these trend regions is
subjective, it is also extremely useful. The details of the description we will obtain
depend on the precise nature of these definitions; the overall features will not. We
write it⁸ using time change notation $d_t$:

$$ d_t\, r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T) = \sum_{k=1}^{3} \lambda_k^{T}\, Dt_k \left[ \Theta(t - t_{k-1}) - \Theta(t - t_k) \right] \qquad (48.1) $$

We describe the trends of these 5 years of data as three regions $k = 1,2,3$ of
time lengths $\{Dt_k\}$ between transition points $\{t_k\}$, with $Dt_k = t_k - t_{k-1}$, during
which interest rates first trended upward, then downward, and again upward. The
reader can see from Fig. 1A that this is a reasonable general description of the
interest rate trends in the data. The parameters $\lambda_k^{T}$, $Dt_k$ in Eqn. (48.1) were
chosen to reflect these trends⁹.


⁷ Notation: Please do not confuse these quasi-equilibrium drift parameters $\lambda_k^{T}$ with the
eigenvalues of the principal components, discussed later. As usual, $\Theta = 1$ for positive
argument, 0 otherwise.


⁸ Fractional changes: Alternatively, fractional rate changes could be used.
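
The straight-line trend segments can be sketched in a few lines. The version below is one reading of Eqn. (48.1), assuming a constant slope inside each region, so that the path for a given maturity is piecewise linear. The function name, the transition times, and the slopes are hypothetical and are not the values used for the 1983-88 data.

```python
def quasi_equilibrium_path(r_start, slopes, transition_times, times):
    """Piecewise-linear trend: slopes[k] applies between
    transition_times[k] and transition_times[k + 1]."""
    path = []
    for t in times:
        r = r_start
        for k, slope in enumerate(slopes):
            t_lo, t_hi = transition_times[k], transition_times[k + 1]
            # Time spent in region k up to time t (zero before it starts).
            r += slope * max(0.0, min(t, t_hi) - t_lo)
        path.append(r)
    return path

# Hypothetical numbers: up, then down, then up again over 5 years.
t_k = [0.0, 1.5, 3.5, 5.0]    # transition points, in years
slopes = [1.0, -2.0, 0.5]     # percent per year in each region
trend = quasi_equilibrium_path(10.0, slopes, t_k, times=[0.0, 1.5, 3.5, 5.0])
```

With these inputs the trend rises from 10% to 11.5%, falls to 7.5%, and ends at 8.25%. Fluctuations are then measured relative to such a path rather than relative to a fixed level.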


IIB. Yield-Curve Kinks: Bête Noire of Yield Curve Models


Consider a yield curve rising in maturity at a given fixed time. If there is a local 
inversion in maturity, the yield curve is not monotonic in maturity, and contains a 
kink. 

The kinks can be characterized qualitatively by two numbers: (1) The 
percentage of cases in which there was a kink; (2) The average size of the kink. A 
kink between the 3 month and 1 year rate occurred in the data 7% of the time.
This means that the 6-month rate was below the 3-month rate 7% of the time, 
with a small average kink size, the difference between the 3 and 6-month rates, 
when there was a local inversion, of around 0.8 bp/yr. Other maturities yield 
similar results. Except for the 30-year rate that was consistently below the 20- 
year rate, average local inversion sizes are all less than 7 bp/yr, and most are on 
the order of 1-2 bp/yr. 
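
The two kink numbers are easy to compute once a definition is fixed. The sketch below uses the simplest one, an inversion between a pair of neighboring maturities. It does not check that the curve as a whole is otherwise rising, which a fuller test would do. The function name and the toy data are hypothetical.

```python
def kink_statistics(yield_curves, j):
    """Fraction of dates on which the yield at maturity index j + 1 is below
    the yield at index j, and the average size (in bp) of that local
    inversion on the dates where it occurs."""
    sizes = [100.0 * (c[j] - c[j + 1]) for c in yield_curves if c[j + 1] < c[j]]
    frequency = len(sizes) / len(yield_curves)
    average_size = sum(sizes) / len(sizes) if sizes else 0.0
    return frequency, average_size

# Toy data (hypothetical): yields in percent for 3M, 6M, 1Y on four dates.
curves = [
    [8.00, 8.30, 8.60],
    [8.10, 8.08, 8.50],   # 6M is 2 bp below 3M: a local inversion
    [8.20, 8.40, 8.80],
    [8.30, 8.60, 8.90],
]
freq, avg_bp = kink_statistics(curves, 0)
```

Here the kink occurs on one date out of four, with a size of about 2 bp. Applied to simulated paths, the same two numbers would be averaged over paths before being compared with the data.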

Fig. 1B shows one path (#100) from the multivariate lognormal
simulation¹⁰,¹¹. Other paths exhibit similar characteristics. This model produces
large kinks. In particular, typical average kink sizes were around 20 bp/yr. Kinks 
occurred 30-50% of the time in this model averaged over paths. These numbers 
are far in excess of the small and rare kinks in the data. 

Fig. 2B shows the results for the time and path averaged statistics from the 
multivariate lognormal model. Most of them are reasonably in accord with data, 
including the output correlations, consistent with the input correlations. The 
output short-maturity vols are a little high. However, the most noteworthy point
is that a large discrepancy exists for $\sigma_{\text{shift}}(T;\Delta T)$, the standard deviation of the
shift between adjacent maturities, which is far too large in the LN yield-curve
model. This is a reflection of the large kinks generated by the model.


⁹ Generalization of the Quasi-Equilibrium Path in the Macro-Micro Model: In the
Macro-Micro model, described in Ch. 50 and 51, these drifts are taken as quasi-random.
They then define quasi-random quasi-equilibrium yield-curve paths.


¹⁰ Why Path Number 100? We simply chose this path number at random for presenting
results. There was nothing special about it, and other paths gave similar conclusions.


¹¹ Kinks in Forward Rate Models? I have subsequently heard that the more modern
forward rate multivariable models can also generate noticeable kinks in the forward rate
curves as time progresses. I believe that this problem is general. It is difficult to get
reliable information, as most people ignore the issue.
Example of Consequence of Kinks for Risk Management


An important consequence of kinks is incorrect spread statistics between different
maturities. For example, the long rate used in Mortgage-Backed Securities 
calculations to drive a prepayment calculation for fixed-rate mortgages will take 
incorrect jumps with respect to the short rate used in the discounting procedure, 
resulting in anomalous prepayments for a given path in the MC simulation. 


Kinks and Arbitrage 


Formal arbitrage possibilities can be created if the kinks are large enough to 
produce negative forward rates. Negative forward rates imply that zero-coupon 
bonds with longer maturity have prices above zeros with shorter maturity. 

To illustrate, a kink originating from a local inversion of 92 bp would 
produce a higher price of a zero-coupon bond with a maturity of 11 years 
compared to a zero with maturity of 10 years with a spot rate of 10%. A local 
inversion of 340 bp would produce a higher price of a zero with a maturity of 15 
years compared to the same 10-year zero. 
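
The size of inversion needed for this can be checked directly from zero-coupon prices. The text does not state the compounding convention, so the sketch below (function names hypothetical) computes the break-even inversion under continuous and under semiannual compounding. With semiannual compounding the thresholds come out near 93 bp and 339 bp, close to the figures quoted above.

```python
def max_longer_rate(spot_short, t_short, t_long, periods_per_year=None):
    """Largest spot rate at maturity t_long for which the t_long zero is not
    more expensive than the t_short zero. periods_per_year=None means
    continuous compounding."""
    if periods_per_year is None:
        # exp(-r_long * t_long) <= exp(-r_short * t_short)
        return spot_short * t_short / t_long
    m = periods_per_year
    # (1 + r_long/m)^(-m t_long) <= (1 + r_short/m)^(-m t_short)
    return m * ((1.0 + spot_short / m) ** (t_short / t_long) - 1.0)

def breakeven_inversion_bp(spot_short, t_short, t_long, periods_per_year=None):
    """Inversion (in bp) beyond which the longer zero costs more."""
    longer = max_longer_rate(spot_short, t_short, t_long, periods_per_year)
    return 1e4 * (spot_short - longer)

inv_11y_cont = breakeven_inversion_bp(0.10, 10, 11)       # continuous
inv_11y_semi = breakeven_inversion_bp(0.10, 10, 11, 2)    # semiannual
inv_15y_semi = breakeven_inversion_bp(0.10, 10, 15, 2)    # semiannual
```

Any local inversion larger than the break-even value makes the longer zero more expensive than the 10-year zero, which is equivalent to a negative forward rate between the two maturities.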


Histogram for Fluctuations of a Given Rate 


A further characterization of the model is found in the statistical distribution or
histogram corresponding to a given rate. Figs. 3A and 3B show the histograms
for the fluctuations of the 10-year rate $\delta r_{\text{fluctuations}}(t, T = 10\ \text{yr})$ and those of
the 3-month rate, respectively. It is important to note that these fluctuations are
measured with respect to the historical quasi-equilibrium path
$r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T)$
defined above. The distributions of the data and of the lognormal model are
shown. The lognormal model generates fluctuations that are much too wide
compared to the data. This is because the paths generated by the model fluctuate
away from the quasi-equilibrium path. For one path, the width of the model
distribution is not far from the data width. However, because the single-path model
distributions are generally displaced from the center of the data distribution, the
total model distribution over all paths is very wide. Thus the lognormal model discrepancy
with data is due to paths wandering too far from the historical quasi-equilibrium
path. We discuss the point further in the next two chapters.


Attempted Modifications Without Strong Mean Reversion 


We attempted a variety of modifications to try to improve the multivariate LN 
model. One modification included memory effects to attempt to reduce the kinks 
and fluctuations by recall of the initial smooth yield curve; others were 
smoothing recipes. In spite of our determined effort over a rather long period, 
none of these efforts (without strong mean reversion) was successful. 
We also examined a multivariate Gaussian model. Similar conclusions were
obtained. 


III. EOF / Principal Component Analysis 


A useful characterization of yield curve movements is given by the empirical
orthogonal function (EOF) analysis, also called principal component analysis
(and other names)¹². This analysis has long been standard in applied physics and
engineering. It was, to the best of our knowledge, first applied to yield-curve
analysis by Garbade in 1986, and independently around the same time by
Herman¹³,¹⁴. The movements of the yield curve are decomposed into
independent movements; technically these movements are orthogonal. These
movements correspond to reasonably down-to-earth descriptions.

The leading movement usually corresponds to a parallel shift of the yield 
curve (though not exactly, and occasionally not even qualitatively). The next 
most important movement can roughly be described as a tilt of the yield curve 
about some maturity as a fulcrum. Following this is a series of movements of 
decreasing importance. 

The physical picture is analogous to the decomposition of the motions of a 
violin string into simple harmonic motions or modes, each with its own 
frequency, making up the rich sound. The movements are described by 
eigenfunctions and their importance by eigenvalues. See Appendix B. 


Principal Components of the Yield-Curve Data 


Fig. 4A shows the first three eigenfunctions of the yield curve movements in the 
data, labeled parallel shift, tilt, and flex. Indeed, the "parallel shift" eigenfunction 
does resemble a parallel shift. The "tilt" eigenfunction resembles a twist of the 
yield curve, and the flex is a more complex motion. There are (for the 11 variates 
used in the multifactor modeling here) eight more eigenfunctions representing 


¹² Notation: EOF analysis is usually referred to as principal component analysis. Jerry
Herman (and also David Haan), who introduced this technique at Merrill Lynch, used the 
name EOF analysis (this is the nomenclature used in geophysics and meteorology). They 
adapted EOF to yield-curve analysis. 


13 Acknowledgements: Jerry Herman was the manager of Merrill's Financial Strategies 
Group. The PhD Quantitative Analysis Group that I managed in 1987-89 was a subgroup. 
Jerry was an unusual and excellent manager who understood quantitative issues well. I 
thank him for many discussions and assistance. 


^ History: Very early, at a seminar I was giving at Merrill in 1986, Jerry Herman 
correctly identified the incorporation of yield curve dynamics as a major unsolved issue 
in models, and he developed a yield-curve model in 1987. Jerry's work was the lynch pin 
that led to the Macro-Micro model described here. 
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movements that are even more complicated. The sum, with the correct weighting, 
gives back the average yield-curve movements over the time window of 5 years 
on which we focused. 

The relative importance or weighting of the various contributions is given by 
the eigenvalues. An eigenvalue measures the amount of yield-curve fluctuations 
characterized by the corresponding eigenfunction. The first three eigenvalues for 
the data are in the approximate ratio 36/8/1. Higher-order modes have smaller 
eigenvalues and characterize the fine details of the fluctuations. 
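
The role of the eigenvalues as weights can be illustrated numerically. The following is a minimal sketch, not the calculation used for the Treasury data: the synthetic "parallel plus tilt plus noise" series, the function name `eof_eigen`, and all parameter values are assumptions made only for illustration.

```python
import numpy as np

def eof_eigen(changes):
    """Eigenvalues (descending) and eigenvectors of the correlation matrix.

    changes: array of shape (N, M) holding N observations of the time
    changes of M rates (one column per maturity).
    """
    corr = np.corrcoef(changes, rowvar=False)   # M x M correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    return eigvals[order], eigvecs[:, order]

# Synthetic stand-in for yield changes (hypothetical, NOT the data of the text)
rng = np.random.default_rng(0)
M, N = 11, 260
maturity = np.linspace(-1.0, 1.0, M)
level = rng.normal(size=(N, 1)) * np.ones(M)      # common "parallel" driver
tilt = 0.4 * rng.normal(size=(N, 1)) * maturity   # weaker "tilt" driver
noise = 0.1 * rng.normal(size=(N, M))             # small idiosyncratic part
changes = level + tilt + noise

eigvals, eigvecs = eof_eigen(changes)
share = eigvals / eigvals.sum()                   # relative weight of each mode
print("eigenvalue ratios (first three, relative to the third):",
      eigvals[:3] / eigvals[2])
print("share of fluctuations in the first three modes:", share[:3])
```

With real data the `changes` array would hold the observed rate changes; the ratio 36/8/1 quoted above would then correspond to the first printed line.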

Figs. 4B show the same graphs for the lognormal simulation. 


A Principal-Component Simulator 


Another interesting Monte-Carlo model, originally proposed by Garbade ™, is to 
assume a lognormal model for yield-curve time changes along each of the EOF 
eigenfunction directions rather than using the maturity-correlation matrix 
technique used in the text and described in Appendix A. We have investigated 
this and found similar results. In particular, the multivariate lognormal model 
based on this modified technique still produces anomalous yield-curve kinks. 


IV. Simpler Lognormal Model with Three Variates 


We have dealt in this chapter with an eleven-variate lognormal model. The 
choice of 11 variates is not mandated. For applications, a simpler model 
composed of, e.g. two or three variates, is preferable. Such a model can now be 
constructed from the eleven-variate model. 

As an example, we have examined a three-factor reduction (i.e. three 
maturities: long, medium, short). As mentioned in the text, the three-factor model 
is the simplest one that is still capable of being tested for kinks. Again, note that 
by definition a two-factor model cannot have kinks. A kink requires three points 
on the yield curve to exist at all. 

One can follow several procedures for this reduction. First, one can extract 
the paths for these three maturities from the paths generated by the model for all 
11 maturities. Second, one can start the process over and model the three 
maturities directly. Third, one can average over variates (e.g. long-end
averaging over, say, the 7 year to 20 year rates to produce a proxy long rate
driving prepayment models). Fourth, one can use the three leading EOFs. The
results for these various procedures are statistically similar when averaged over 
paths, but of course will not be equivalent path by path for a given maturity. 


15 Acknowledgement: I thank Sheldon Epstein for conversations on principal
component simulators.

Kinks in the 3-Variate LN Model


The problems of the eleven-variate lognormal model do not disappear when this
reduction is made. One of the most serious problems, as mentioned many times
already, is the presence of kinks. Kinks large enough to produce negative forward
rates were in fact generated in some cases of the three-variate lognormal model¹⁶.


V. Wrap-Up and Preview of the Next Chapters 


We have presented a methodology for examining and characterizing yield-curve 
dynamics and have obtained what we feel is important insight into the stochastic 
properties of yield-curve movements. The various statistical tests and the EOF or 
principal component yield-curve analysis have led to the conclusion that the 
multivariate lognormal model, while a straightforward generalization of the 
popular single-variate lognormal model, fails in some important regards. 


First Improvement: Using Strong Mean Reversion (Ch. 49) 


The next chapter Ch. 49 describes an improved model using strong mean- 
reversion for fluctuations, which removes the problematic yield-curve kink 
anomalies. As a result, we believe that interest rates, as exhibited in historical 
data, do not exhibit the statistical properties of a stochastic process that spreads 
out in time. The same slowly varying quasi-equilibrium historical yield-curve 
path is used, around which the fluctuations occur. 


Further Improvement: The Macro-Micro Model (Ch. 50, 51) 


Ch. 50, 51 contain the description of the Macro-Micro model. This model can be
used for pricing contingent claims and sophisticated interest-rate risk analysis.
The basic innovation is that the historical quasi-equilibrium yield curve path is
generalized to possess a special quasi-random nature. The Macro-Micro model
contains explicit time scales. The slowly varying quasi-equilibrium paths are
interpreted in terms of long-term trending quasi-random macroeconomic effects
on interest rates. The rapid fluctuations about a quasi-equilibrium yield-curve
path are interpreted in terms of trading activities.


16 Generality: Negative forwards are a potential problem in any multifactor model whose
underlying variables are composite and not the forwards themselves, unless strong mean
reversion is present. Another model based on zero-coupon rates (also without strong
mean reversion) that generates negative forwards is described in Ch. 51.

Appendix A: Definitions and Stochastic Equations


In this appendix, we give some definitions of quantities used in the text and 
present the stochastic equations defining the multivariate lognormal model. 


Define the $T$-year maturity rate at time $t$ for path $\alpha$ as $r_\alpha(t,T)$. Data will be
denoted using the same symbol without the path label, $r(t,T)$. For some
function $x$ of the rates and the time, the differential change over a small time $dt$
is written as

$$ d_t x[r_\alpha(t,T), t] = \mu_\alpha[r_\alpha(t,T), t]\,dt + \sigma_\alpha[r_\alpha(t,T), t]\,dZ_\alpha(t,T) \tag{48.2} $$

Here, $\mu_\alpha[r_\alpha(t,T), t]$ and $\sigma_\alpha[r_\alpha(t,T), t]$ are the (possibly) rate-dependent
mean or drift and volatility for maturity $T$ at time $t$ along path $\alpha$. The Gauss-
Wiener process is defined as usual as having zero mean when the time series is
averaged over an infinite time window for path $\alpha$,

$$ \langle dZ_\alpha(t,T) \rangle_t = 0 \tag{48.3} $$

The cross-correlations among maturities give the correlation matrix, which for an
infinite time window is independent of the path,

$$ \langle dZ_\alpha(t,T_i)\, dZ_\alpha(t,T_j) \rangle_t = \rho(T_i,T_j)\,dt \tag{48.4} $$

This is solved in terms of independent normal distributions by¹⁷

$$ dZ_\alpha(t,T) = \sum_{T'} A(T,T')\, N_\alpha(t,T')\, \sqrt{dt} \tag{48.5} $$

Here

$$ \rho(T_i,T_j) = \sum_{T'} A(T_i,T')\, A(T_j,T') \tag{48.6} $$

Also, $N_\alpha(t,T')$ is the random number drawn from an $N(0,1)$ normal
distribution for the yield with maturity $T'$, at time $t$ on path $\alpha$. This equation
can be checked by squaring it and using the independence property

$$ \langle N_\alpha(t,T')\, N_\alpha(t,T'') \rangle_t = \delta_{T'T''} \tag{48.7} $$
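
The construction of correlated Gaussian increments from independent normal draws can be sketched numerically. This is an illustration only: it assumes a Cholesky factor is used for the matrix $A$ (one of several matrices satisfying $\rho = A A^T$), and the 3x3 correlation values, the weekly time step, and the function name `correlated_increments` are hypothetical.

```python
import numpy as np

def correlated_increments(rho, n_steps, dt, rng):
    """Draw increments dZ with <dZ_i dZ_j> = rho_ij * dt.

    Uses a lower-triangular A with A @ A.T == rho (Cholesky factor),
    applied to independent N(0,1) draws, then scaled by sqrt(dt).
    """
    A = np.linalg.cholesky(rho)
    normals = rng.standard_normal((n_steps, rho.shape[0]))  # independent N(0,1)
    # Row n holds dZ(t_n, T_i) = sum_j A_ij * N_j(t_n) * sqrt(dt)
    return np.sqrt(dt) * normals @ A.T

# Illustrative correlations between three maturities (not taken from the data)
rho = np.array([[1.0, 0.8, 0.5],
                [0.8, 1.0, 0.7],
                [0.5, 0.7, 1.0]])
dt = 1.0 / 52.0                      # weekly step, in years
rng = np.random.default_rng(1)
dZ = correlated_increments(rho, 200_000, dt, rng)

# Time-averaged products divided by dt should approximate rho
sample_rho = (dZ.T @ dZ) / (len(dZ) * dt)
print(np.round(sample_rho, 2))
```

The final lines perform the "squaring" check described above: the time average of products of increments, divided by $dt$, recovers the input correlation matrix up to sampling noise.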


17 Real-World Issues: See Ch. 24 for a description of the problems of getting $A$ in
practice from the data.

Correlation Formalism in Two and Three Dimensions

Call the correlations $c_{ij} = \rho_{ij}$, with $s_{ij}^2 = 1 - \rho_{ij}^2$ (where maturities are
labeled by $i, j$)¹⁸. Then if $d_t x_i = \mu_i\,dt + \sigma_i\,dz_i$ with $\langle dz_i\,dz_j \rangle = \rho_{ij}\,dt$ we can
define independent measures $\{dw_i\}$. These measures satisfy the orthogonal
condition $\langle dw_i\,dw_j \rangle = \delta_{ij}\,dt$. We get $d_t x_i = \mu_i\,dt + \sigma_i \sum_j A_{ij}\,dw_j$, where the
correlation matrix is $\rho = A A^T$. This can be solved directly by orienting
the $d_t x_i$ variables simply with respect to the $\{dw_i\}$ variables, taking successive
products of two $d_t x_i$ variables to get the correct correlations, and then ensuring
that the normalization of each of them is correct. We get

$$
\begin{aligned}
d_t x_1 &= \mu_1\,dt + \sigma_1\,dw_1 \\
d_t x_2 &= \mu_2\,dt + \sigma_2 \left[ c_{12}\,dw_1 + s_{12}\,dw_2 \right] \\
d_t x_3 &= \mu_3\,dt + \sigma_3 \left[ c_{13}\,dw_1 + \left( \frac{c_{23} - c_{12}\,c_{13}}{s_{12}} \right) dw_2 + \Delta_{123}\,dw_3 \right]
\end{aligned}
\tag{48.8}
$$

Here the "triangle function" is

$$ \Delta_{123} = \frac{1}{s_{12}} \left[ 1 - c_{12}^2 - c_{13}^2 - c_{23}^2 + 2\,c_{12}\,c_{13}\,c_{23} \right]^{1/2} \tag{48.9} $$
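
As a check on the three-dimensional formulas, the coefficients multiplying the independent measures can be assembled into a matrix and compared with the correlation matrix. The sketch below is only an illustration; the correlation values and the function name `three_variate_matrix` are assumptions, and the comparison with a library Cholesky factor is added here as a cross-check.

```python
import numpy as np

def three_variate_matrix(c12, c13, c23):
    """Coefficient matrix A of the dw's for three variates, with rho = A A^T."""
    s12 = np.sqrt(1.0 - c12**2)
    # "Triangle function" multiplying dw_3 in the third variate
    triangle = np.sqrt(1.0 - c12**2 - c13**2 - c23**2
                       + 2.0 * c12 * c13 * c23) / s12
    return np.array([
        [1.0, 0.0, 0.0],
        [c12, s12, 0.0],
        [c13, (c23 - c12 * c13) / s12, triangle],
    ])

c12, c13, c23 = 0.8, 0.5, 0.7        # illustrative correlations only
A = three_variate_matrix(c12, c13, c23)
rho = np.array([[1.0, c12, c13],
                [c12, 1.0, c23],
                [c13, c23, 1.0]])

print("A A^T reproduces rho:", np.allclose(A @ A.T, rho))
print("A equals the Cholesky factor:", np.allclose(A, np.linalg.cholesky(rho)))
```

The triangle function is real only when the three correlations are mutually consistent (the correlation matrix is positive definite), which is the case for the values chosen here.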


Means and Volatilities 


For a given time window and a given set of paths, we write the time-averaged
and path-averaged drift and squared volatility as

$$
\begin{aligned}
\mu(T) &= \left\langle \mu_\alpha[r_\alpha(t,T), t] \right\rangle_{t,\alpha} \\
\sigma^2(T) &= \left\langle \sigma_\alpha^2[r_\alpha(t,T), t] \right\rangle_{t,\alpha}
\end{aligned}
\tag{48.10}
$$

Averages for the data are defined the same way except, of course, there is no
path averaging. For the LN case, the time change $d_t x$ up to a convexity
correction is

$$ d_t x[r_\alpha(t,T), t] \cong d_t r_\alpha(t,T) / r_\alpha(t,T) \tag{48.11} $$


18 Correlation Formalism: Correlations were discussed in detail in Ch. 22-25. We retain
the discussion given in the original paper here in order to make the discussion self-
contained in this chapter.


Spread Shifts between Rates of Adjacent Maturities 


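The shift or spread between yields of adjacent maturities $T$ and $T + \Delta T$ on the
yield curve, at fixed time on a given path, is defined as

$$ \Delta_T r_\alpha(t,T;\Delta T) = r_\alpha(t,T+\Delta T) - r_\alpha(t,T) \tag{48.12} $$

Here, $\Delta T = \Delta T(T)$ is the maturity increment between defined nearest-neighbor
maturities on the yield curve (e.g. 10 years between the 20 year and 30 year
rates). Averaging $\Delta_T r_\alpha(t,T;\Delta T)$ over time and paths, with averaging notated
by $\langle \cdot \rangle_{t,\alpha}$, gives the mean spread or shift,

$$ \mu_{\Delta r}(T;\Delta T) = \left\langle \Delta_T r_\alpha(t,T;\Delta T) \right\rangle_{t,\alpha} \tag{48.13} $$

The spread or shift squared volatility is given by

$$ \sigma_{\Delta r}^2(T;\Delta T) = \left\langle \left[ \Delta_T r_\alpha(t,T;\Delta T) \right]^2 \right\rangle_{t,\alpha} - \mu_{\Delta r}^2(T;\Delta T) \tag{48.14} $$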


The Kink Definition 


We define a yield-curve "kink" when a yield of higher maturity is lower than a 
yield of lower maturity. While this does not yield much useful information in the 
case of an overall-inverted yield curve, it certainly is useful in discriminating 
between models when the yield curve is normal and increasing. As described in 
the text, kinks play a major role in formulating sharp tests between various 
models. 

The mean shifts and volatilities for kinks are defined exactly as before using 
Eqns. (48.13) and (48.14), with the additional filter that the yield difference 
between neighboring maturities is negative, thus producing the kink in the first 
place, i.e. 


$$ \Delta_T r_\alpha(t,T;\Delta T) < 0 \tag{48.15} $$
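
The spread statistics and the kink filter can be sketched as follows. This is a minimal illustration under stated assumptions: the array layout `(n_paths, n_times, n_maturities)`, the function name `spread_stats`, and the tiny three-maturity example are hypothetical and are not the simulator used in the text.

```python
import numpy as np

def spread_stats(paths, kinks_only=False):
    """Mean and standard deviation of adjacent-maturity spreads.

    paths: array of shape (n_paths, n_times, n_maturities), maturities in
    increasing order. Averages run over time and paths. With kinks_only=True,
    only negative spreads (longer-maturity yield below its shorter neighbor)
    enter the averages.
    """
    spreads = np.diff(paths, axis=-1)            # r(t, T + dT) - r(t, T)
    if kinks_only:
        spreads = np.where(spreads < 0.0, spreads, np.nan)
    mean = np.nanmean(spreads, axis=(0, 1))
    var = np.nanmean(spreads**2, axis=(0, 1)) - mean**2
    return mean, np.sqrt(np.maximum(var, 0.0))   # guard against round-off

# One path, three dates, three maturities (yields in percent, made up)
paths = np.array([[[5.0, 5.5, 6.0],
                   [5.0, 4.8, 6.0],     # kink between first two maturities
                   [5.0, 5.2, 5.1]]])   # kink between last two maturities

mean_all, std_all = spread_stats(paths)
mean_kink, std_kink = spread_stats(paths, kinks_only=True)
kink_fraction = np.mean(np.diff(paths, axis=-1) < 0.0)
print("mean spreads:", mean_all, " kink-only mean spreads:", mean_kink)
print("fraction of spreads that are kinks:", kink_fraction)
```

Applied to data and to model paths on a normal, increasing yield curve, the kink-only statistics give one way to make the comparison between models quantitative.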


Appendix B: EOF or Principal-Component Formalism 


In this appendix we briefly describe the Empirical Orthogonal Function (EOF) or
principal component expansion analysis or eigenvector expansion used in the text.
This expansion has other names, including the Karhunen-Loeve expansion,
modal analysis, etc. We first construct the matrix $\Gamma$ consisting of $N$
observations in time (labeled by $n = 1 \ldots N$) of the time changes of each of $M$
different rates (labeled by $m = 1 \ldots M$). We had eleven variates, so $M = 11$. We
therefore write

$$ \Gamma_{nm} = r(t_{n+1}, T_m) - r(t_n, T_m) \tag{48.16} $$

The $M \times M$ covariance matrix $R$ is then constructed as

$$ R_{mm'} = \frac{1}{N} \sum_{n=1}^{N} \Gamma_{nm}\,\Gamma_{nm'} - \left[ \frac{1}{N} \sum_{n=1}^{N} \Gamma_{nm} \right] \left[ \frac{1}{N} \sum_{n=1}^{N} \Gamma_{nm'} \right] \tag{48.17} $$

$$ R_{mm'} = \sigma_m\,\sigma_{m'}\,\rho_{mm'} \tag{48.18} $$

where $\sigma_m$ is the $m$-th volatility. We now look for the $M$ solutions labeled by
$\beta = 1 \ldots M$ to the eigenvalue equation for $R$, namely

$$ \sum_{m'=1}^{M} R_{mm'}\,\psi_{m'}^{(\beta)} = \lambda^{(\beta)}\,\psi_m^{(\beta)} \tag{48.19} $$

The quantities $\psi_m^{(\beta)}$ for a given $\beta$ can be thought of as the components of an
$M$-vector; this vector is called an eigenvector or, with some mathematical abuse,
an eigenfunction¹⁹. The number $\lambda^{(\beta)}$ is called the eigenvalue associated with this
eigenvector. The sum of the eigenvalues is $\sum_{\beta=1}^{M} \lambda^{(\beta)} = M$. This is easily seen in
the case that the matrix is diagonal with off-diagonal elements equal to zero (total
non-correlation²⁰) with each $\lambda^{(\beta)} = 1$. The eigenfunctions for the three largest
eigenvalues for the data and for the lognormal model are plotted against the
maturity index $m$ in Figs. 4A and 4B, respectively.
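
The eigenvalue problem and its basic properties can be checked numerically. The sketch below is an illustration under explicit assumptions: the synthetic series are hypothetical, and the matrix is normalized to unit volatilities (a correlation matrix), which is the normalization under which the eigenvalues sum to $M$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 500, 11
# Synthetic correlated "rate changes" (hypothetical stand-in for the data)
gamma = rng.standard_normal((N, M)) @ rng.standard_normal((M, M))

# Assumption: unit-volatility normalization, so R has ones on the diagonal
R = np.corrcoef(gamma, rowvar=False)

# Column psi[:, b] is the eigenvector for eigenvalue lam[b]
lam, psi = np.linalg.eigh(R)

eigen_residual = R @ psi - psi * lam      # eigenvalue equation, all b at once
orthogonality = psi.T @ psi               # sum over m: should be identity
completeness = psi @ psi.T                # sum over b: should be identity
R_rebuilt = (psi * lam) @ psi.T           # sum_b psi_m lam_b psi_m'

print("sum of eigenvalues:", lam.sum(), " M =", M)
print("max eigen-equation residual:", np.abs(eigen_residual).max())
print("R rebuilt from eigenpairs:", np.allclose(R_rebuilt, R))
```

For a covariance matrix with the volatilities left in, the same code applies with `np.cov` in place of `np.corrcoef`; the eigenvalues then sum to the total variance rather than to $M$.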


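19 Eigenvector or Eigenfunction? I use these terms interchangeably, even though these
are not functions. The advantage of using “eigenfunction” is that in notes "ef" can be
used and is not confused with the abbreviation “ev” for eigenvalue.


20 Random Gaussian Noise: Each eigenvalue being 1 is the result of infinite N and finite M for
M series constructed of uncorrelated Gaussian random numbers. The theory of such
series generally is called “Random Matrix Theory” or RMT.

We have the completeness and orthogonality relations for the eigenfunctions,

$$ \sum_{\beta=1}^{M} \psi_m^{(\beta)}\,\psi_{m'}^{(\beta)} = \delta_{mm'}, \qquad \sum_{m=1}^{M} \psi_m^{(\beta)}\,\psi_m^{(\beta')} = \delta_{\beta\beta'} \tag{48.20} $$

We also have the decomposition of the original matrix as

$$ R_{mm'} = \sum_{\beta=1}^{M} \psi_m^{(\beta)}\,\lambda^{(\beta)}\,\psi_{m'}^{(\beta)} \tag{48.21} $$

We can decompose the individual rates as mentioned in Ch. 45,

$$ r(t_n, T_m) = \sum_{\beta=1}^{M} \psi_m^{(\beta)}\,y_n^{(\beta)} \tag{48.22} $$

Here, $y_n^{(\beta)}$ is given by the inverse formula,

$$ y_n^{(\beta)} = \sum_{m=1}^{M} \psi_m^{(\beta)}\,r(t_n, T_m) \tag{48.23} $$

We can also define, for each time,

$$ Y_m^{(\beta)} = Y_m^{(\beta)}(t_n) = \psi_m^{(\beta)} \sum_{m'=1}^{M} \psi_{m'}^{(\beta)}\,r(t_n, T_{m'}) = \psi_m^{(\beta)}\,y_n^{(\beta)} \tag{48.24} $$

We write the time-averaged $m$-th rate as $\langle r \rangle_m = \langle r(t, T_m) \rangle_t$. Then,

$$ \langle r \rangle_m = \sum_{\beta=1}^{M} \langle Y_m^{(\beta)} \rangle, \qquad \langle Y_m^{(\beta)} \rangle = \psi_m^{(\beta)} \sum_{m'=1}^{M} \psi_{m'}^{(\beta)}\,\langle r \rangle_{m'} \tag{48.25} $$

We can reconstruct the amount of movement of the yield curve "along" the
"direction" of each eigenfunction. Write $E_{\beta m} = \psi_m^{(\beta)}$ as the components of an
$M \times M$ matrix. This matrix is an orthogonal matrix, namely its inverse is its
transpose. Now define the matrix with elements $C_{\beta n}$, depending on the
eigenfunction label $\beta$ and the time index $n$, as

$$ C_{\beta n} = \sum_{m=1}^{M} E_{\beta m}\,\Gamma_{nm} = \sum_{m=1}^{M} \psi_m^{(\beta)}\,\Gamma_{nm} \tag{48.26} $$

Then we can define

$$ D_t Y_{m,n}^{(\beta)} = C_{\beta n}\,\psi_m^{(\beta)} = \psi_m^{(\beta)} \sum_{m'=1}^{M} \psi_{m'}^{(\beta)}\,\Gamma_{nm'} \tag{48.27} $$

This gives the component of the measurement of the $T_m$ maturity yield
change at time $t_n$ along the $\beta$-th eigenvector $\psi^{(\beta)}$. The time average of the
differenced rate at fixed maturity is $\langle D_t r \rangle_m = \frac{1}{N} \sum_{n=1}^{N} \Gamma_{nm}$. We can then
write the various time-averaged components of the yield shifts as

$$ \langle D_t Y_m^{(\beta)} \rangle = \frac{1}{N} \sum_{n=1}^{N} D_t Y_{m,n}^{(\beta)} = \psi_m^{(\beta)} \sum_{m'=1}^{M} \psi_{m'}^{(\beta)}\,\langle D_t r \rangle_{m'} \tag{48.28} $$

These equations can also be written using the projection operator $P^{(\beta)}$, which is

$$ P_{mm'}^{(\beta)} = \psi_m^{(\beta)}\,\psi_{m'}^{(\beta)} \tag{48.29} $$

It is useful to define the scalar quantity $\langle D_t y^{(\beta)} \rangle$ using

$$ \langle D_t y^{(\beta)} \rangle = \sum_{m=1}^{M} \psi_m^{(\beta)}\,\langle D_t r \rangle_m \tag{48.30} $$

With this definition, we have

$$ \langle D_t Y_m^{(\beta)} \rangle = \psi_m^{(\beta)}\,\langle D_t y^{(\beta)} \rangle \tag{48.31} $$

It is useful to get the composition of the average rate move for a given maturity in
terms of the principal component average moves with different $\beta$. This is

$$ \langle D_t r \rangle_m = \sum_{\beta=1}^{M} \langle D_t Y_m^{(\beta)} \rangle \tag{48.32} $$

Simpler Three Principal Component Example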


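All this formalism may seem unintuitive. As the canonical simple example²¹, we
take three principal EOF components. These are generally called “parallel”, “tilt”
and “flex”. Three components suffice for most yield-curve movements most of
the time²². Say there are three points on the yield curve $m = 1, 2, 3$ corresponding
to 2, 5, 10 years. The eigenfunctions and scalar principal components are

$$
\psi^{(\beta=\mathrm{Parallel})} = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad
\psi^{(\beta=\mathrm{Tilt})} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \qquad
\psi^{(\beta=\mathrm{Flex})} = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}
\tag{48.33}
$$

$$
\begin{aligned}
\langle D_t y^{(\mathrm{Parallel})} \rangle &= \frac{1}{\sqrt{3}} \left[ \langle D_t r \rangle_{2yr} + \langle D_t r \rangle_{5yr} + \langle D_t r \rangle_{10yr} \right] \\
\langle D_t y^{(\mathrm{Tilt})} \rangle &= \frac{1}{\sqrt{2}} \left[ \langle D_t r \rangle_{2yr} - \langle D_t r \rangle_{10yr} \right] \\
\langle D_t y^{(\mathrm{Flex})} \rangle &= \frac{1}{\sqrt{6}} \left[ \langle D_t r \rangle_{2yr} - 2 \langle D_t r \rangle_{5yr} + \langle D_t r \rangle_{10yr} \right]
\end{aligned}
\tag{48.34}
$$

For simplicity, we set $Dr_m = \langle D_t r \rangle_m$, $DY_m^{(\beta)} = \langle D_t Y_m^{(\beta)} \rangle$. The vector quantities
(in maturity) are given by equations in which the projection matrices $P^{(\beta)}$ enter:

$$
\begin{aligned}
DY^{(\mathrm{Parallel})} &= \frac{1}{3} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} Dr_2 \\ Dr_5 \\ Dr_{10} \end{pmatrix}
= \frac{1}{3} \begin{pmatrix} Dr_2 + Dr_5 + Dr_{10} \\ Dr_2 + Dr_5 + Dr_{10} \\ Dr_2 + Dr_5 + Dr_{10} \end{pmatrix} \\
DY^{(\mathrm{Tilt})} &= \frac{1}{2} \begin{pmatrix} 1 & 0 & -1 \\ 0 & 0 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} Dr_2 \\ Dr_5 \\ Dr_{10} \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} Dr_2 - Dr_{10} \\ 0 \\ -Dr_2 + Dr_{10} \end{pmatrix} \\
DY^{(\mathrm{Flex})} &= \frac{1}{6} \begin{pmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{pmatrix} \begin{pmatrix} Dr_2 \\ Dr_5 \\ Dr_{10} \end{pmatrix}
= \frac{1}{6} \begin{pmatrix} Dr_2 - 2 Dr_5 + Dr_{10} \\ -2 Dr_2 + 4 Dr_5 - 2 Dr_{10} \\ Dr_2 - 2 Dr_5 + Dr_{10} \end{pmatrix}
\end{aligned}
\tag{48.35}
$$

In each case the three rows correspond to $m = 1, 2, 3$.

The 2-year rate time-averaged difference, for example, can be reconstructed as²³

$$
\begin{aligned}
\langle D_t r \rangle_{2yr} &= \langle D_t Y^{(\mathrm{Parallel})} \rangle_{2yr} + \langle D_t Y^{(\mathrm{Tilt})} \rangle_{2yr} + \langle D_t Y^{(\mathrm{Flex})} \rangle_{2yr} \\
&= \tfrac{1}{3} \left[ Dr_2 + Dr_5 + Dr_{10} \right] + \tfrac{1}{2} \left[ Dr_2 - Dr_{10} \right] + \tfrac{1}{6} \left[ Dr_2 - 2 Dr_5 + Dr_{10} \right] \\
&= Dr_2
\end{aligned}
\tag{48.36}
$$


21 Canonical Butterfly Example: As a physics colleague Henry Abarbanel once wisely
remarked, “One Example is Worth Two Theorems”. This is the canonical principal
component example. The flex component is called a “butterfly”. The parallel component
is an exactly parallel shift, and the tilt component is the “Twos-Tens” tilt.


22 Are just three EOF principal components really enough? Usually, but not always.
For example, if some desk has some complicated strategy with a messy yield-curve
composition, three components are not enough. Therefore, if you structure your risk
management system to include only three principal components you may miss some
important risk on the desk that is invisible to your measuring probes.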
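
The canonical three-component example is small enough to verify directly. In the sketch below the average rate moves in `dr` are made-up numbers, the sign convention chosen for the tilt vector is an assumption (the projection is unaffected by the overall sign), and the variable names are hypothetical.

```python
import numpy as np

# Canonical eigenvectors for the 2, 5, 10 year points
psi = {
    "parallel": np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0),
    "tilt": np.array([1.0, 0.0, -1.0]) / np.sqrt(2.0),
    "flex": np.array([1.0, -2.0, 1.0]) / np.sqrt(6.0),
}

# Hypothetical time-averaged moves of the 2, 5, 10 year rates (in percent)
dr = np.array([0.30, 0.20, 0.05])

# Scalar principal components: <D y> = sum_m psi_m <D r>_m
scalar = {name: vec @ dr for name, vec in psi.items()}

# Vector components: projection matrix P = psi psi^T applied to <D r>
vector = {name: np.outer(vec, vec) @ dr for name, vec in psi.items()}

# Summing the three vector components gives back the average moves
rebuilt = sum(vector.values())

for name in psi:
    print(name, "scalar:", scalar[name], "vector:", vector[name])
print("rebuilt moves:", rebuilt)
```

The parallel component is the same for all three maturities, the tilt component vanishes at the 5 year point, and the three components sum to the original move at every maturity, including the 2 year point.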


Real-Life Principal Components 


In general, we will not get these simple forms from the data, but they will be 
reminiscent of the canonical example. For example, in 1992 the flex component 
was 


(ny "9)-l016(D7),  -0831(D7), , +0.57(Р,г)у„_„] (4837) 


The First 3 Principal Components from the Data 
Fig. 5A shows the initial and final yield curves for the period 5/88 to 3/89. Fig.
5B shows the first three time-averaged components $\langle D_t Y^{(\mathrm{Parallel})} \rangle$,
$\langle D_t Y^{(\mathrm{Tilt})} \rangle$, $\langle D_t Y^{(\mathrm{Flex})} \rangle$ of the yield shifts over this period. Roughly, as can be seen
from these graphs, yield movements over this period did behave as a parallel shift
along with a clockwise tilt around the middle of the yield curve. A small amount
of flexing was also present.


23 Notation: Please note that the <DY> quantities here are not the same as the <DY>
quantities on the previous page.

Anomalous Principal Component Behavior

Anomalous behavior occasionally occurs, in which the first eigenfunction, which
we label “parallel”, does not in fact correspond to a parallel movement.

To illustrate, Fig. 6A shows a breakdown in small intervals of data between 
the dates 2/19/88 and 1/30/89. Fig. 6B shows the components of the yield shifts 
over the period 2/19/88 to 5/13/88. The leading component labeled "parallel 
shift" is not at all parallel but rather tilted opposite to the second component 
labeled "tilt". The reason for the anomalous behavior was connected to negative 
correlations between the short and long ends of the yield curve during this period. 


Picture — An Ellipsoid 
An $M$-dimensional ellipsoid is the relevant geometry here. The biggest ellipsoid
dimension has length equal to $\sqrt{\lambda^{\max}}$, where $\lambda^{\max} = \max_\beta \left[ \lambda^{(\beta)} \right]$, and is in
the direction of the corresponding eigenvector $\psi^{(\beta_{\max})}$ with components
$\psi_m^{(\beta_{\max})}$. The next biggest ellipsoid dimension corresponds to the second
largest eigenvalue and its eigenvector. And so forth.
Picking out two dimensions with labels $\beta_1$, $\beta_2$ and $\lambda^{(\beta_1)} > \lambda^{(\beta_2)}$, we have an
ellipse with major and minor axes of lengths $\sqrt{\lambda^{(\beta_1)}}$, $\sqrt{\lambda^{(\beta_2)}}$ in the orthogonal
directions $\psi^{(\beta_1)}$, $\psi^{(\beta_2)}$.


“Sub-Period Centered” Principal Components


We have been using the convention of subtracting out the time-means of the
variables $\Gamma_{nm}$ over all times with $n = 1 \ldots N$ to define $R_{mm'}$ in Eq. (48.17), i.e.
using $\mu_m = \frac{1}{N} \sum_{n=1}^{N} \Gamma_{nm}$, so $\mu_m = \langle D_t r \rangle_m$.
However that is not the only possible procedure. We can use a different
subtraction. The result will not be the covariance matrix, but it can be useful, e.g.
measuring deviations from the mean of a sub-period²⁴, indexed by $\kappa = 1 \ldots N'$
with $N' \ll N$.

Define the mean $\nu_m$ over a sub-period, viz

$$ \nu_m = \frac{1}{N'} \sum_{\kappa=1}^{N'} \Gamma_{\kappa m} \tag{48.38} $$

We obtain the corresponding quadratic form $S_{mm'}$ as

$$ S_{mm'} = \frac{1}{N} \sum_{n=1}^{N} \left( \Gamma_{nm} - \nu_m \right) \left( \Gamma_{nm'} - \nu_{m'} \right) \tag{48.39} $$

We obtain the eigenvalues of $S$ and the corresponding eigenvectors
$\{ \varphi^{(\beta)} \}$ with components $\varphi_m^{(\beta)}$.

The set of $S$ eigenvectors forms a complete set (as do the eigenvectors of
$R$), so either set of eigenvectors can be used. This is the crucial point.
We can calculate the overlap between the eigenvectors of $R$ and of $S$.
Defining $\langle \mu\,\psi^{(\beta)} \rangle = \sum_{m=1}^{M} \mu_m\,\psi_m^{(\beta)}$ etc., after some algebra we obtain


24 Sub-Period Centered PC Analysis of Climate Change: It makes sense to look at how
available temperature data differ from the 20th century mean temperature in order to
analyze temperature during the 20th century. See Ch. 53.


_ AX» 


(v6) rs rr A wem o] (48.40) 
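
The relation between the two sets of eigenvectors can be explored numerically. The sketch below rests on an explicit assumption: the quadratic form $S$ is summed over all $N$ observations with the sub-period mean subtracted. Under that assumption $S$ differs from $R$ by the outer product of the difference of means $d = \mu - \nu$, and an overlap identity follows from the two eigenvalue equations. The identity checked here is derived from that assumption and is not claimed to be the printed form of Eq. (48.40); the data and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, N_sub = 400, 4, 50
# Synthetic correlated series with a nonzero mean (hypothetical data)
gamma = rng.standard_normal((N, M)) @ rng.standard_normal((M, M)) + 0.3

mu = gamma.mean(axis=0)            # full-period means
nu = gamma[:N_sub].mean(axis=0)    # sub-period means (first N_sub points)
d = mu - nu

R = (gamma - mu).T @ (gamma - mu) / N    # covariance matrix
S = (gamma - nu).T @ (gamma - nu) / N    # sub-period centered form

lam_R, psi = np.linalg.eigh(R)     # columns psi[:, b]
lam_S, phi = np.linalg.eigh(S)     # columns phi[:, b']

# overlap[b, b'] = sum_m psi_m^(b) phi_m^(b')
overlap = psi.T @ phi

# From psi^T S phi evaluated two ways:
# (lam_S[b'] - lam_R[b]) * overlap[b, b'] = (d . psi^(b)) * (d . phi^(b'))
lhs = (lam_S[None, :] - lam_R[:, None]) * overlap
rhs = np.outer(psi.T @ d, phi.T @ d)

print("S = R + d d^T:", np.allclose(S, R + np.outer(d, d)))
print("overlap identity holds:", np.allclose(lhs, rhs))
```

Both eigenvector sets are orthonormal bases of the same $M$-dimensional space, which is why either can be used; the overlap matrix is the rotation between them.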


Figures: Multivariate Lognormal Yield-Curve Model 


Fig. [1]. Fig. 1A shows the three-dimensional plot of weekly U.S. Treasury 
yield-curve data over the period 1983-1988. Fig. 1B exhibits a yield-curve path 
(consisting of eleven maturities) from the Monte-Carlo simulation for the 
multivariate lognormal LN model, starting from the same initial yield curve as in 
Fig. 1A. Note that the LN model yield curves have kinks not seen in the data. 


Fig. [2]. Statistical measures used to characterize the yield-curve time series
are shown. The maturity index, 1 to 11, represents the 3M, 6M, and 1, 2, 3, 4, 5, 7,
10, 20, 30-year rates, respectively. Figs. 2A show data for spread shifts between
neighboring maturities. These are the shift means $\mu_{\Delta r}(T;\Delta T)$ and standard
deviations $\sigma_{\Delta r}(T;\Delta T)$ at the top. The yield volatilities $\sigma(T)$ are in the
center. At the bottom are the correlations $\rho(3M,T)$ for the 3M rate with other
rates, and the correlations $\rho(10yr,T)$ of the 10-year rate with other rates. Figs.
2B show the same graphs for the time and path-averaged lognormal simulation.
Note that the vols of the shift spreads are too large in the LN model as compared
with the data.


Fig. [3]. Figs. 3 show histograms of the fluctuations of rates, measured
around the slowly varying quasi-equilibrium yield curve discussed in the text.
Figs. 3A and 3B show the fluctuations of the 10-year and of the 3-month rates for
the data compared to the lognormal simulation. Note that the LN fluctuations are
too wide compared to the data.


Fig. [4]. Principal component EOF analysis (see Appendix B). Figs. 4A show
data for the first three eigenfunctions plotted against maturity, labeled “parallel
shift”, “tilt”, and “flex”. Note that the “parallel shift” is not exactly parallel. The
eigenvalues are also listed. Figs. 4B show the same graphs for the LN simulation.


Fig. [5A]. Treasury yield curves in 5/88 and in 3/89. 


Fig. [5B]. The first three principal components of the movements of the yield 
curve between 5/88 and 3/89 (see App. B), labeled parallel, tilt, and flex. 


Fig. [6A]. Five treasury yield curves showing anomalous behavior. 


Fig. [6B]. The principal components of the data between 2/19/88 and
5/13/88. Note that the “parallel shift” component is not parallel. See text.


[Figure 2A: Data Statistics (1983-88). Panels: mean shift between adjacent maturities (shift in bp); std. dev. of shift between adjacent maturities; volatility vs. maturity index; correlation coefficient of the 3-month rate; correlation coefficient of the 10-year rate.]

[Figure 2B: LN Model Statistics. Same panels as Figure 2A. Annotation: vol of neighboring spread differences too big in LN model.]

[Figure: Fluctuations Around Historical Quasi-Equilibrium Path for LN Model vs. Data (10-yr Rate)]

[Figure: Fluctuations Around Historical Quasi-Equilibrium Path for LN Model vs. Data (3-Month Rate)]

[Figure: Principal Components for LN Model]

[Figure: Treasury Yield Curves 5/88 and 3/89]

[Figure: Principal Components for Treasury Yield Curves (5/88 - 3/89)]

[Figure: Yield Curve Flattening Behavior]

[Figure: Principal Components during anomalous period 2/88 - 5/88. Note that the “parallel shift” is not parallel.]


49. Strong Mean-Reverting Multifactor YC Model
and the 3rd Order Green Function (Tech. Index 7/10)


Summary of this Chapter 


This chapter¹ is the second in the series in this book on modeling the empirically
observed dynamics of the yield curve. You do not have to read previous chapters
to understand this chapter. Necessary background will be summarized.

A strong mean-reverting multivariate model is constructed that agrees well 
with yield-curve data’. This dramatically improves on the multivariate lognormal 
yield-curve model introduced in Ch. 48. Many statistical probes are used in the 
analysis. It should be mentioned that the data for the yield volatilities and 
correlations are input to all the multifactor models we examine. 

One very useful probe, Cluster-Decomposition Analysis (CDA), is 
introduced. It is based on third-order correlation functions in a manner taken 
from theoretical physics, generalizing skewness. The difficult data properties to 
reproduce are the third-order correlation measurements along with the absence of 
kinks or anomalous yield curve shape changes, as described by the statistical 
properties of spreads of neighboring maturities. 

An economical description is achieved by identifying a smooth, slowly 
varying historical quasi-equilibrium yield curve, about which small, rapid 
fluctuations occur with very strong mean reversion. Models without strong mean 
reversion fail to describe yield-curve shape-change statistical data. 

The next chapter describes a generalization (the Macro-Micro Model) that is 
appropriate for valuing contingent claims and for interest-rate-risk analysis. The 
main idea is the generalization to quasi-random quasi-equilibrium yield curves. 


1 History: This chapter is based on Ref. i. The work was done in collaboration with Alan
Beilis. The strong mean-reverting approach is, I believe, the first and perhaps the only 
multifactor model that successfully reproduces the statistics of the movements and shapes 
of the yield curve data, and at the same time avoids yield-curve kinks. The important 
thing to keep in mind is that the strong nature of the mean reversion is not an arbitrary 
assumption, but seems to be implied by the data. 




Introduction to this Chapter 


This is the second chapter in a series summarizing an effort aimed at producing a 
realistic yield-curve multivariate model. Models usually have one factor, 
although sometimes two-factor models have been used". Our philosophy, 
described in the previous chapter, is to describe the yields directly with a 
multivariate model’. 


What Should a Realistic Multifactor Yield-Curve Model Accomplish? 


We want to demand that a yield-curve model possess two desirable attributes. 
The main point is that we want the model to reproduce the essential statistical 
characteristics of yield-curve data. We eventually also want the model to be 
capable of pricing contingent claims. The satisfaction of these two requirements 
is highly non-trivial. In the previous chapter, we showed ™ that the multivariate 
lognormal model fails to describe important yield-curve shape statistical 
properties and in particular generates kinks. On the other hand, an extension of 
the model in this chapter will be needed to enable the pricing of options. 


The Cluster-Decomposition Analysis (CDA) 


A sharp statistical probe is introduced, called Cluster-Decomposition Analysis
(CDA), taken from theoretical physics. This analysis, although it unfortunately
shares the name, has nothing to do with the cluster decomposition used in
statistical regression. The CDA technique uses third-order correlation functions to
uncover properties of the stochastic processes underlying the yield-curve data.

The CDA is used as a probe to produce a new model. This new model is 
based on exceptionally strong mean-reversion with rapid fluctuations about a 
moving quasi-equilibrium slowly varying yield curve. While mean reversion is 
an old idea, the way we use mean reversion is new. First, a quasi-equilibrium 
yield curve is defined by the long-time behavior of the yield-curve data. Second, 
the mean reversion is very strong. The mean reversion needed to describe the 
yield-curve statistical data is in fact so strong that the fluctuations are tightly 
bounded around the quasi-equilibrium curve. 
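
The qualitative difference between strong mean reversion and a process that spreads out in time can be seen in a one-variable toy simulation. This is only a sketch: the fluctuation about the quasi-equilibrium curve is modeled here by a single Gaussian variable with an Euler time step, and the mean-reversion strength, volatility, horizon, and function name `simulate` are illustrative assumptions, not values fitted to the yield-curve data.

```python
import numpy as np

def simulate(kappa, sigma, n_paths, n_steps, dt, rng):
    """Euler paths of dx = -kappa * x * dt + sigma * dW.

    x is the deviation from the quasi-equilibrium level (taken as zero here).
    kappa = 0 gives a driftless Gaussian process with no mean reversion.
    """
    x = np.zeros((n_paths, n_steps + 1))
    shocks = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    for k in range(n_steps):
        x[:, k + 1] = x[:, k] - kappa * x[:, k] * dt + sigma * shocks[:, k]
    return x

rng = np.random.default_rng(4)
dt, n_steps = 1.0 / 252.0, 5 * 252          # daily steps over five years
free = simulate(0.0, 1.0, 2000, n_steps, dt, rng)      # no mean reversion
strong = simulate(50.0, 1.0, 2000, n_steps, dt, rng)   # strong mean reversion

for label, x in (("no mean reversion", free), ("strong mean reversion", strong)):
    print(label,
          "| std across paths at 2.5y:", x[:, n_steps // 2].std(),
          "| at 5y:", x[:, -1].std())
```

Without mean reversion the width across paths keeps growing with the square root of time, whereas with strong mean reversion it saturates quickly at a small value, so the paths stay bunched around the quasi-equilibrium level.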


Problematic Yield-Curve Features Described by the SMRG Model 


The strong mean reversion is built into a multivariate Gaussian model; we call it
the Strong Mean-Reverting Gaussian (SMRG) model³. The SMRG model passes
the CDA tests. Models without strong mean reversion fail the CDA tests. This
failure occurs for both the single-factor and for the multi-factor models without
strong mean reversion.


2 Yield-Curve Simulator: The yields for different maturities are modeled as correlated
stochastic processes.


3 What About Strong Mean-Reverting Models other than Gaussian? The mean
reversion implied by the yield-curve data is so strong that other models would also no
doubt work, for example a strong mean-reverting lognormal model. See below.

The multivariate SMRG model passes other statistical tests described in the 
previous chapter that models without strong mean reversion fail. These include: 
(1) the absence of large kinks and consistency with the spread statistics between 
yields of neighboring maturities; and (2) the eigenfunctions and eigenvalues of 
the Empirical Orthogonal Function (EOF) or principal component analysis as 
compared to data. Other more common statistical probes are also included, such 
as the yield correlation matrix and yield volatilities. 


Generalization of the SMRG Model to the Macro-Micro Model 


The multivariate SMRG model described in this chapter is not appropriate for a 
simulator to price contingent claims such as options. This is because the strong 
mean reversion about the quasi-equilibrium yield curve needed to describe data 
when placed in a simulator produces tightly bunched paths. The paths do not “fan 
out", and therefore do not sample the range of future interest rates present in 
standard options models that do describe market options prices. 

It is important to understand that the bunching up of paths in yield-curve 
space is not an assumption but rather an apparently unavoidable consequence of 
describing the details of historical yield-curve data. 

In the next chapter" we describe how the SMRG model can be extended to 
produce the Macro-Micro model that can price contingent claims while 
maintaining consistency with the absence of kinks and other statistical properties 
of yield-curve data. The main idea is the generalization of the historical quasi- 
equilibrium yield curve to quasi-random quasi-equilibrium yield curves. 


WKB: Small Fluctuations from a Quasi-Equilibrium Yield-Curve Path 


The SMRG model achieves simplicity in a manner well known from physics. The 
basic concept is to look for a quasi-equilibrium state that varies smoothly and 
from which small, rapid fluctuations occur. An example is the WKB method". 
Our fundamental point is that the same idea works for the description of 
yield-curve data as well. We believe that this fact is both attractive and important. 


Yield-Curve Data Used for the Analysis 


The same data were used as in Ch. 48: weekly U.S. Treasury yield data over 2, 5, 
and 10 year periods, and for eleven maturities between 3 months and 30 years. 
Mostly the analysis focused on the period 1983-88. Input data included the yield
volatilities and the yield correlation matrix. Note that these days it would be more
relevant to use swap rates. Because of the relation between a swap rate and a
bond (cf. Ch. 8), the analysis presented here is in fact relevant.


Results of the Numerical Analysis 


The most striking way of presenting the results is simple visual examination. 
Figure 1 exhibits a three-dimensional plot of weekly yield-curve data over the 
period 1983-1988 as well as a path (of all maturities) from the Monte-Carlo 
simulation for the strong mean-reverting Gaussian model, starting from the same 
initial yield curve. The two plots are intentionally unlabeled. The reader is invited 
to distinguish between them’. It is clear that there is statistical similarity and the 
yield curves generated in the model are visually similar to the data. Other SMRG 
model paths yield similar results. 

In contrast, paths from the lognormal or Gaussian models without strong 
mean reversion exhibit a variety of unphysical features, including unacceptably 
large local yield-curve inversions or kinks, as discussed in Ch. 48 and Ref. iii. 


Short and Long Time Scales 


Our main conclusion is that, barring exceptional events like the 1987 crash, the 
yield-curve data can be described in a simple and economical fashion using the 
following construction. There are two time scales, "long" and "short". On long 
time-scales, e.g. months, interest rates drift along slowly time-varying quasi- 
equilibrium means. On short time-scales (e.g. days), fluctuations about these 
means are mainly observed to take place in a bounded region in the 
multidimensional multi-maturity interest-rate space. 

That is, the data do not act as if they correspond to a stochastic process that 
spreads out in time according to standard models. We attribute the failure of 
traditional models to satisfy the CDA and other statistical tests to this property. 

On the contrary, the strong mean-reversion produces an effective lower and 
upper bounding characteristic about the means, with very low probability of the 
rates exceeding these bounds. In particular, no negative or zero model interest 
rate values are ever observed in practice in the SMRG model we construct in this 
paper containing this large mean reversion. We believe that this holds more 
generally, independent of the presence or absence of any barriers at zero 
interest rate, provided the quasi-equilibrium yield curve stays away from zero 
rates (a lower bound of about 3% is enough in practice)’. 


^ Fifty-Fifty Votes: "Can You Tell the Difference Between the SMRG Model and the 
Data?" Since writing the original paper in Ref. 1, I have given a number of talks on this 
work. I first show the yield-curve data. Then later in the talk I show an unlabeled plot 
containing both the data and the SMRG result for one path. I ask the audience to vote on 
which plot contains the data. The results are on the average 50-50 for those that vote. 
Many people are so unsure that they don't vote at all. More accurately, the voting is 
typically 1/3-1/3-1/3. 

The human eye is an excellent pattern recognition instrument. The fact that trained 
analysts are generally unable to distinguish between the data and the SMRG model 
is a powerful statement that the SMRG model indeed provides a reasonable 
phenomenological description of the data. This is backed up by the successful statistical 
yield-curve analysis. 


Remainder of this Chapter 


The remainder of the chapter is organized as follows. In Section II, we describe 
the Cluster Decomposition Analysis. The main results of the paper are also given 
in this section. Section III presents the results from the rest of the statistical tests. 
Section IV summarizes the paper and introduces the next chapter. Appendix A 
reviews the stochastic dynamics for the various models considered. Appendix B 
contains a short description of the cluster-decomposition analysis formalism. 


Cluster Decomposition Analysis and the SMRG Model 


In this section, we describe the Cluster Decomposition Analysis applied to the 
yield-curve data and the models. We find that the SMRG model, i.e. the 
multivariate Gaussian model with strong mean reversion about a "quasi- 
equilibrium" slowly varying mean yield-curve path, is preferred by the CDA 
tests. Multivariate Gaussian or lognormal models without mean reversion 
essentially fail the CDA tests. 

The method of Cluster Decomposition Analysis is based on the following 
observation. Consider first a simple Gaussian or normal probability density 
function. It is specified by its first two moments—the mean and width, or 
volatility. All higher moments (skewness, kurtosis etc.) factor into a sum of 
products of powers of these first two moments. This idea generalizes to a time 
series x(t), as explained in Appendix B. The mean is just the time average over 
a given window. 

The second-order correlation functions generalizing the second moments are 
defined as follows. Take one photo-copy of a given time series, displace it by a 
lag time with respect to the original, multiply the points of the series and the 
displaced series together over some time window where both series exist, and 
average over time. We call the result the "total two-point function" $M_2$. It is 
convenient to define the "connected two-point function" or auto-correlation 
function $C_2$, which is $M_2$ minus the product of the means. 
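
The construction just described can be sketched in a few lines. This is a 
minimal illustration, not the code used for the analysis; the function name 
`two_point` and the use of a single common window for both means are 
assumptions made here. 

```python
def two_point(x, lag):
    """Return (M2, C2) for the series x at the given non-negative lag.

    M2 is the time average of x(t) * x(t + lag) over the window where both
    the series and its displaced copy exist; C2 is M2 minus the product of
    the two window means. (Hypothetical helper for illustration.)
    """
    n = len(x) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must be non-negative and shorter than the series")
    original = x[:n]             # original series on the common window
    displaced = x[lag:lag + n]   # the "photo-copy", displaced by the lag
    m2 = sum(p * q for p, q in zip(original, displaced)) / n
    mean_original = sum(original) / n
    mean_displaced = sum(displaced) / n
    c2 = m2 - mean_original * mean_displaced
    return m2, c2


# Example: a constant series has M2 equal to the square of the constant
# and no connected part at any lag.
m2_const, c2_const = two_point([2.0] * 10, lag=3)
```

At zero lag $C_2$ reduces to the ordinary variance over the window, which is a 
quick sanity check on any implementation. 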


? Very Low Rates: This statement should be re-examined for the very low rates that have 
occurred in recent years. 


Third-Order Correlation Functions 


The new probes we use are third-order correlation functions that are the 
generalizations of the "skewness". Take two photo-copies of the given time 
series, displace them each by a lag time with respect to the original, multiply the 
three points of the series and the two displaced series together over some time 
window where all series exist, and average over time. This produces the "total 
three-point function" $M_3$. Now subtract from $M_3$ the quantity that $M_3$ would 
equal if the process x(t) were Gaussian. The result is the "connected three-point 
function" $C_3$. This function is the analog of the skewness of the distribution for 
ordinary statistics. For a Gaussian stochastic process, the identity $C_3 = 0$ holds 
exactly. That is, $C_3$ is zero for all time lags of the two photocopies. 

The same construction can be used with different time series instead of 
identical time series; for a multivariate Gaussian process defining these time 
series, the connected three-point function (with any of the variates taken for any 
time series) is identically zero. 
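
A minimal sketch of the three-point construction follows. It assumes that the 
"quantity that the total three-point function would equal for a Gaussian 
process" is the standard combination of means and connected two-point 
functions (mean times covariance, summed over the three pairings, plus the 
product of the three means). The function name `three_point` and the common 
averaging window are choices made here for illustration. 

```python
def three_point(x, lag1, lag2):
    """Return (M3, C3) for the series x and two non-negative lags.

    M3 averages x(t) * x(t + lag1) * x(t + lag2) over the common window.
    C3 subtracts the value M3 would take for a Gaussian process, built from
    the window means and connected two-point functions.
    (Hypothetical helper for illustration.)
    """
    n = len(x) - max(lag1, lag2)
    if min(lag1, lag2) < 0 or n <= 0:
        raise ValueError("lags must be non-negative and shorter than the series")
    a = x[:n]
    b = x[lag1:lag1 + n]
    c = x[lag2:lag2 + n]

    def avg(values):
        return sum(values) / n

    ma, mb, mc = avg(a), avg(b), avg(c)
    m3 = avg([p * q * r for p, q, r in zip(a, b, c)])
    # Connected two-point functions for the three pairings.
    c_ab = avg([p * q for p, q in zip(a, b)]) - ma * mb
    c_ac = avg([p * r for p, r in zip(a, c)]) - ma * mc
    c_bc = avg([q * r for q, r in zip(b, c)]) - mb * mc
    gaussian_value = ma * c_bc + mb * c_ac + mc * c_ab + ma * mb * mc
    return m3, m3 - gaussian_value
```

With both lags set to zero, $C_3$ computed this way is the third central moment 
over the window, i.e. the unnormalized skewness. 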

Higher-point functions can be defined using more time series; we began to 
look at these but found it unnecessary since it turns out that the three-point 
function suffices for sharp tests. 


How the Third-Order Correlation Function is Used in Practice 

We use these third-order or three-point correlation functions to test for an 
underlying stochastic property of data in the following way. First, make a 
hypothesis regarding the stochastic property. Formulate the hypothesis in such a 
way that the variable used to define a time series is in fact Gaussian by definition. 
Then, for this time series, if the hypothesis is in fact correct, $C_3$ will be 
identically zero by definition. We call this the "zero-$C_3$ test". If the hypothesis is 
"approximately" correct, $C_3$ will be small compared to $M_3$. If the hypothesis 
"fails", $C_3$ and $M_3$ will have comparable magnitude. While (as in all 
phenomenology) subjectivity is involved in the words "approximately" and 
"fails", we believe that the results we obtain are striking enough to speak for 
themselves. 


Specifically, we define an empirical putative Gaussian noise variable $dz^{data}$ 
that we will test to see if it is really Gaussian, which then tests the model. 

Consider the simple model $dx = \mu\,dt + \sigma\,dz$. If this model is accurate, then 
substituting data for the returns and the drift, 
$dx^{data} = \mu^{data}\,dt + \sigma^{data}\,dz$ will hold with $dz$ Gaussian. To test 
this idea, we merely algebraically solve for $dz$; adding a data label $dz^{data}$ 
to show we used the data, we have 

$$ dz^{data} = \frac{1}{\sigma^{data}} \left( dx^{data} - \mu^{data}\,dt \right) \qquad (49.1) $$

We then test the data-defined quantity $dz^{data}$ using data plugged into the right 
hand side of this equation, and just check to see if $dz^{data}$ is (approximately) 
Gaussian. If so, the model is supported. If $dz^{data}$ is far from Gaussian, the model 
is not supported. 

The main test to see if $dz^{data}$ is Gaussian is the 3rd-order Green function 
generalized skewness test. 

Although we have illustrated the idea with a simple model, the procedure for 
testing putative models clearly is general. We just need to be able to solve for the 
Gaussian $dz$ from whatever model is being considered, insert data, and then test 
to see if this data-defined $dz^{data}$ is really Gaussian or not. 
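
The "solve for the noise, then test it" procedure can be sketched as follows for 
the simple model with constant drift and volatility. The function names are 
hypothetical, and the ordinary sample skewness is used here only as a simple 
stand-in diagnostic (the zero-lag special case); it is not the full lagged 
three-point test described in the text. 

```python
def implied_noise(dx, mu, sigma, dt):
    """Solve dx = mu*dt + sigma*dz for dz, one observation at a time.

    dx is a list of observed changes; mu, sigma, and dt are taken as constants
    in this sketch. (Hypothetical helper for illustration.)
    """
    return [(change - mu * dt) / sigma for change in dx]


def sample_skewness(z):
    """Third central moment divided by variance**1.5 (zero for symmetric data)."""
    n = len(z)
    mean = sum(z) / n
    variance = sum((v - mean) ** 2 for v in z) / n
    third = sum((v - mean) ** 3 for v in z) / n
    return third / variance ** 1.5


# Example with made-up weekly changes: extract the putative noise and look at
# its skewness; a value far from zero would count against the model.
changes = [0.10, -0.05, 0.02, -0.08, 0.04, 0.01, -0.03]
noise = implied_noise(changes, mu=0.0, sigma=0.05, dt=1.0)
skew = sample_skewness(noise)
```

For a different model, only `implied_noise` changes; the Gaussian test applied 
to its output stays the same. 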


Mathematical Example 

In order for the reader to better understand the ideas, we first consider one 
realization of a purely lognormal process, $L = \exp[\sigma N(0,1)]$, using the 
exponent of a Gaussian random-number generator. We plot one-dimensional 
graphs showing one variable; hence, the times for the starts of photocopies are 
fixed. We fixed these times at 10 weeks from the end of the Gaussian series for 
convenience. The start time for the original time series x(t) is the variable t (in 
weeks from the end of the series). 

The results for the lognormal random-number generator L are shown in Figs. 
2A. When t = 10, we expect and indeed we see peaks in $M_2$, $C_2$, and $M_3$ 
because then the original and xerox-copied time series "line up". For a discretized 
Gaussian process, these peaks decay rapidly with the time lag (infinitely rapid 
decay as the time window becomes infinitely long); this demonstrates the 
"short-range correlations" of the Gaussian process. However, as mentioned above, $C_3$ 
cannot have any peak for a Gaussian process and in fact should be zero. Because 
we must choose a finite-sized time window, this is only approximately 
realized in practice; indeed $C_3 \approx 0$ is seen to be true. 


LN Model Analysis Using 3rd-Order or Three-Point Correlation Function 


We now consider actual data viewed as a lognormal process. If we hypothesize 
that the T-year yield follows a lognormal process, we would construct from the 
data the time series of the temporal shift in this T-year yield divided by the 
T-year yield, i.e. $d_t x(t) = d_t r(t;T) / r(t;T)$. According to the hypothesis we 
just made, x(t) is supposed to follow a Gaussian process. To check the 
hypothesis, we merely construct the quantities $M_2$, $C_2$, $M_3$, and $C_3$ for this 
putative Gaussian process. If they look like the results in Figs. 2A generated by a 
Gaussian process, then there is evidence that the T-year rate follows a lognormal 
process. If on the other hand $C_3 / M_3$ is not small, the hypothesis of a lognormal 
process for r(t;T) is not supported. 
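
As a small sketch of how the putative Gaussian series is built from a yield 
history under the lognormal hypothesis, with the purely Gaussian (normal) 
hypothesis shown alongside for comparison. The function names and the sample 
yields are hypothetical. 

```python
def relative_changes(yields):
    """Lognormal hypothesis: d_t x(t) = d_t r(t;T) / r(t;T)."""
    return [(yields[i + 1] - yields[i]) / yields[i] for i in range(len(yields) - 1)]


def absolute_changes(yields):
    """Normal (Gaussian) hypothesis: d_t x(t) = d_t r(t;T)."""
    return [yields[i + 1] - yields[i] for i in range(len(yields) - 1)]


# Made-up weekly 10-year yields, in percent.
ten_year = [10.0, 10.5, 10.29, 10.4]
x_lognormal = relative_changes(ten_year)
x_normal = absolute_changes(ten_year)
```

Either series would then be passed to the two-point and three-point 
correlation functions to see whether the connected three-point function is 
small compared to the total one. 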


Figs. 2B shows the result of applying the lognormal hypothesis to the 10-year 
rate, where the time (in weeks) goes from 5/88, back towards 9/83. The two-point 
functions $M_2$ and $C_2$ do peak at ten weeks. This means that the short-time 
correlations characteristic of the hypothesized Gaussian time series seem to be 
present. However, the ratio of the three-point functions $C_3 / M_3$ is not at all small 
as it should be if the hypothesis of lognormality of the 10-year rate were true. We 
regard this as a failure of the 10-year rate generated by the lognormal model to 
satisfy the zero-$C_3$ cluster-decomposition analysis test. 

The same CDA tests were constructed for other maturity yields, and for 
cross-maturity correlations. Similar results were obtained. In particular, the 
zero-$C_3$ test fails in every case for the multivariate lognormal model. 

We could also hypothesize that the yield curve movements are purely normal 
or Gaussian. Here, x(t) is just assumed to be the short-term rate, or as proxy, a 
short-term yield. However, similar results to the lognormal case are obtained here 
for all functions $M_2$, $C_2$, $M_3$, and $C_3$ across maturities. Again, the zero-$C_3$ test 
fails, with $C_3$ again only slightly smaller than $M_3$. 


The Historical Quasi-Equilibrium Data Yield Curve and Mean Reversion 


Another possible stochastic process involves mean reversion. The mean reversion 
can be taken about the historical quasi-equilibrium yield curve. As mentioned 
many times, the historical quasi-equilibrium yield curve is the path of drifts for 
each yield over long time regions. Our purpose here is to decipher the historical 
statistical properties of the underlying yield-curve dynamics. This complicated 
task is facilitated by inserting a simple description of the quasi-equilibrium yield 
curve historically and then examining the fluctuations carefully. 

While the details of the description we will obtain depend on the precise 
nature of these definitions, the overall features will not. However, not all 
definitions are equivalent. Choosing a drift which does not "reasonably" 
reproduce data trends will lead to a definition of fluctuations that will be more 
complicated than that given here, since some of what "should" be described as 
trend is mixed in with what "should" be described as fluctuation. 

The historical quasi-equilibrium yield curve is not the no-arbitrage yield 
curve defined by the drifts producing consistency with bond price data. We want 
to perform a comparison of the SMRG model with yield-curve data. The moving 
averages of the data were what they were, and were naturally not bound by any 
considerations of today's yield curve used to produce future yield curves through 
no-arbitrage. No-arbitrage term-structure constraints will be discussed in Ch. 51. 


Historical Quasi-Equilibrium Yield Curve Path for Data 


The goal is neither to produce many descriptions, nor to prove that some 
description is unique (since it is not). We want a simple and useful description. 
This goal is achieved by describing the trends of the five years of data as being 
composed of three regions $\kappa = 1, 2, 3$ during which interest rates first trended 
upward, then downward, and again upward. 

As in Ch. 48, we introduce rate slope parameters $\mu_\kappa^{(T)}$ over macroscopic 
time intervals $\{\Delta t_\kappa = t_\kappa - t_{\kappa-1}\}$ between transition points 
$\{t_\kappa\}$, and write 

$$ dr^{Path}(t,T) = \sum_{\kappa=1}^{3} \mu_\kappa^{(T)}\, \Delta t_\kappa \left[ \Theta(t - t_{\kappa-1}) - \Theta(t - t_\kappa) \right] \qquad (49.2) $$

The reader can see from Fig. 1 that this is indeed a reasonable general 
description of interest rate trends. The trends do not capture the fine details of the 
fluctuations; the SMRG model will accomplish that task. The drifts of the yields 
in these three regions are included in the Monte-Carlo models as inputs, thus 
guaranteeing that the overall trends will be correct. 
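
One simple way to picture a three-region trend of this kind is a 
piecewise-linear path with a constant slope in each region. The sketch below 
assumes exactly that reading (constant slope per region, continuous at the 
transition points, flat outside the covered range); the function name, the 
slopes, and the transition times are hypothetical values, not the ones fitted 
to the data. 

```python
def quasi_equilibrium_path(r0, slopes, times, t):
    """Piecewise-linear trend for one maturity.

    r0 is the level at times[0]; slopes[k] is the constant slope on the
    region between times[k] and times[k + 1]. Outside the covered range
    the path is held flat. (Hypothetical helper for illustration.)
    """
    if len(times) != len(slopes) + 1:
        raise ValueError("need one more transition time than slopes")
    level = r0
    for k, slope in enumerate(slopes):
        start, end = times[k], times[k + 1]
        if t <= start:
            break                       # t lies before this region
        level += slope * (min(t, end) - start)
    return level


# Three regions: up, then down, then up again (made-up numbers).
slopes = [1.0, -2.0, 0.5]          # rate units per unit time
transition_times = [0.0, 2.0, 5.0, 7.0]
path = [quasi_equilibrium_path(10.0, slopes, transition_times, t)
        for t in (0.0, 1.0, 2.0, 5.0, 7.0)]
```

Fluctuations are then measured relative to this trend, one maturity at a time. 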

Conversely, if "unreasonable" drifts not reproducing trends are input, the 
performance of every simulator we tried deteriorated. 


SMRG Model Analysis Using 3rd-Order Correlation Function 


The mean-reverting Gaussian model can be turned into a form that can be 
examined by the above methodology. See Appendix A. One defines $d_t x(t)$ by 
adding to the temporal shift of a T-year yield a mean-reverting term defined by 
multiplying a mean-reversion parameter by the difference of the T-year yield and 
its mean defined above. If the hypothesis of a mean-reverting Gaussian process 
for the T-year yield is correct, the presumed Gaussian process using the data in 
the stochastic equation will indeed satisfy the mathematical properties of a 
Gaussian process. 
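
The definition in the preceding paragraph can be sketched as follows. This is 
an illustration under stated assumptions: the mean-reverting term is taken to 
be the mean-reversion parameter times (yield minus quasi-equilibrium mean) 
times the time step, the volatility is constant, and any drift of the mean 
path itself is ignored. The precise equations are those of Appendix A; the 
function name and numbers here are hypothetical. 

```python
def smrg_implied_noise(rates, mean_path, omega, sigma, dt):
    """Putative Gaussian noise under a mean-reverting Gaussian hypothesis.

    For each step: (change in yield + omega * (yield - mean) * dt) / sigma.
    rates and mean_path are lists of equal length on the same time grid.
    (Hypothetical helper for illustration.)
    """
    if len(rates) != len(mean_path):
        raise ValueError("rates and mean_path must have the same length")
    noise = []
    for i in range(len(rates) - 1):
        shift = rates[i + 1] - rates[i]
        reversion = omega * (rates[i] - mean_path[i]) * dt
        noise.append((shift + reversion) / sigma)
    return noise


# Made-up example: a yield one point above its mean that falls back by 0.6
# over a step dt = 0.2 with omega = 3 leaves essentially no residual noise.
residual = smrg_implied_noise([6.0, 5.4], [5.0, 5.0], omega=3.0, sigma=0.5, dt=0.2)
```

The resulting series is what gets fed to the three-point test; a pull back 
toward the mean that the plain Gaussian hypothesis would count as a noise 
shock is instead absorbed by the mean-reverting term. 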

Figs. 2C shows the 2nd and 3rd order correlation functions under the 
assumption of Gaussian mean reversion for the 10-year rate. As opposed to the 
previous cases, with a sufficiently large value for the mean-reversion parameter, 
$C_3$ is now seen to be quite small relative to $M_3$. Indeed, all functions now look 
quite similar to the pure Gaussian process of Figs. 2A; the zero-$C_3$ test now 
seems quite well satisfied. We take this as evidence that the 10-year rate closely 
follows a strongly mean-reverting process. 

The CDA tests were performed over time windows of 2, 5, and 10 years and 
across the yield curve, all with the same conclusion. 


Large Value of the Mean Reversion in this Model 


The value of the mean-reverting parameter $\omega$ does appear to be very large. For 
values of $\omega$ less than about 1/week, failure for the zero-$C_3$ test is obtained. A 
transition occurs at around $\omega$ = 3/week, which leads to positive results (Figs. 
2C), so we use this value. Values up to $\omega$ = 5/week may be acceptable as 
described in the next section’. 

An interesting result is that, although it might have been true that the mean- 
reversion parameter could depend on maturity, similar values seem to hold for all 
maturities. Moreover, similar parameters hold for cross-maturity correlation 
functions. 


Strong Mean Reversion - Incorporation into Other Models 


We have examined a strong mean-reverting Gaussian process. The mean 
reversion can be combined with any volatility assumption (Gaussian, lognormal, 
CIR model” etc.). We believe that similar results would hold for other models. 
This is because barriers at zero interest rate are largely irrelevant for strong 
mean-reverting processes, if the quasi-equilibrium yield curve path is sufficiently 
away from zero rates. We find that the data do satisfy this criterion for the quasi- 
equilibrium yield curve. Large mean-reverting processes in practice only very 
rarely lead to interest rate paths anywhere near zero; hence the details of the 
presence or absence of a barrier make little difference’. 


The SMRG Model is Quite Different from the Vasicek Mean Reverting 
Model, or the usual MRG finance model 


Mean reversion about the quasi-equilibrium yield curve is quite different from 
the original application of Vasicek “" that had mean reversion about one fixed 
rate (at infinite time). It is also quite different from the mean-reverting-Gaussian 
term structure model described in Ch. 43. 


* Mean Reversion Effects on Interest-Rate Fluctuations: We have checked the effect 
on interest-rate fluctuations due to the strong mean reversion. A mean-reversion 
parameter of 3-5/week typically produces mean-reversion-induced changes in rates per 
time period on the order of 25% of the changes due to random fluctuations. 


Fat Tails Producing Jumps/Gaps are an Extra Component 


It is interesting to speculate on the non-zero part of $C_3$, forming a correction to 
the SMRG process we have been considering. We believe that this correction is 
connected to "crashes" like that of October 1987 producing long tails on the 
interest-rate process. While the idea of fat tails is not new (Fama ™ discussed the 
problem in 1965), our version is different. The tails in our view are just a small 
remnant of the total, with most of the dynamics controlled by the strongly mean- 
reverting process about the quasi-equilibrium yield curve. The tails may or may 
not give rise to a small component of infinite-variance Pareto statistics in the 
dynamics; this is in any case difficult to examine with only a few crashes. We 
note that for the October '87 crash, the interest-rate fluctuation was less than 2%. 
We consider the point a bit further in the next section’. 

In Ch. 46, we introduced the Reggeon Field Theory as a possible mechanism 
for generating such fat tails. 


Troublesome Aspects of Standard Approaches to Interest Rate Dynamics 


Our results imply that interest rates, as exhibited in historical data, do not behave 
as corresponding to a standard stochastic process that spreads out in time. Such 
processes include the standard zero (or small) mean-reversion models in common 
use. Such models fail the zero-$C_3$ test, as we have seen. It is a challenge to any 
putative multifactor model to come to grips with the statistics of yield curve 
movements. 


Physical Picture - How Interest Rates Really Seem to Behave 


From our analysis, interest rates seem to behave as if there were a mean "quasi- 
equilibrium" moving-average yield-curve path, which changes smoothly and 
slowly. The quasi-equilibrium yield-curve path as defined by the yield-curve data 
does move with time and can lead to high or low values for rates, thus producing 
high or low trends in a smooth way. High or low values only succeed each other 
over relatively long time scales. 

We believe that these results are a-priori reasonable. Interest-rate fluctuations 
are in fact tightly constrained about a moving mean, as a glance at the data in the 
bottom part of Fig. 1 will show. The magnitude of these rapid fluctuations can be 
estimated by eye from the data to lie in a region of about ±1% about the 
smooth, slowly varying quasi-equilibrium rates. There are almost never large, 
sudden interest-rate fluctuations over small time scales. That is, the fluctuations 
in the data do seem essentially bounded. The essentially bounded property is 
consistent with and is implied by the large mean reversion we find. 


’ Fat Tails are Important but are Not the Dominant Feature for most Yield Curve 
Movements: This topic is treated in some detail in Ch. 21. Note that the fat tails do not 
affect the main conclusion, that most of the dynamics corresponds to small strongly 
mean-reverting fluctuations around a quasi-equilibrium slowly moving yield curve. The 
fat tails produce jumps from time to time that constitute an extra component. 

We believe that the words "smooth", "slow", and "long" can in practice be 
defined with respect to time scales of, e.g., several months. Actually, we believe 
that an infinite number of time scales exists. We are merely dividing the spectrum 
up into two parts, a low frequency part corresponding to the slow variation of the 
quasi-equilibrium means and a high-frequency part corresponding to the small 
rapid fluctuations. This can be made more precise using Fourier analysis. 


Other Statistical Tests and the SMRG Model 


In this section, we briefly describe other statistical probes used to test the models 
and data. They are the same as in Ch. 48 and in Ref. iii. These include mean 
shifts and fluctuations between rates of adjacent maturities along the yield curve, 
means, volatilities, and correlations. As mentioned in Ch. 48, a critical measure 
of the yield-curve shape is the amount of local inversions, or kinks. These occur 
when the yield of a longer maturity is lower than the yield of a shorter maturity 
but where the yield curve itself is not inverted. Large kinks are unphysical, and 
the need to avoid kinks constitutes a sharp test that is very difficult for models to 
pass. 

Equations are given in Appendix A. Statistical quantities are defined for the 
data, as usual, over a given time window. For the simulator, they are calculated 
for each yield-curve path and then averaged over all paths. 

Figs. 3A shows the time-averaged statistical properties of weekly Treasury 
yield-curve data (9/83 to 5/88), repeated from Ch. 48. These are the mean shifts 
$\mu_{shift}(T; \Delta T)$ between rates of adjacent maturities, standard deviations 
$\sigma_{shift}(T; \Delta T)$ of the mean shifts between rates of adjacent maturities, rate 
volatilities $\sigma_r(T)$, correlations $\rho(3M, T)$ of the 3-month rate with other rates, 
and correlations $\rho(10yr, T)$ of the 10-year rate with other rates. 


Repeat: Problems with Kinks in Models without Strong Mean Reversion 


The local inversions or kinks can be characterized qualitatively by the percentage 
of cases in which there was a kink and the average size of the kink. A kink 
between the 3-month and 1-year rates occurred in the data only 7% of the time. 
Other maturities yield similar results. Except for the 30-year rate that was 
consistently below the 20-year rate at the time, average local inversion sizes are 
all less than 7 bp/yr, and most are on the order of 1-2 bp/yr. 
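
The two kink measures quoted above (how often, and how large) can be sketched 
as follows. This illustration assumes yields are in percent, maturities are in 
years, and "the yield curve itself is not inverted" is tested by comparing the 
longest and shortest maturities only. That last criterion, the function name, 
and the sample curves are assumptions made here. 

```python
def kink_statistics(curves, maturities, i):
    """Kink frequency (percent of dates) and average kink size (bp/yr)
    between adjacent maturities i and i + 1.

    curves is a list of yield curves (one per date), each a list of yields in
    percent ordered by maturity. (Hypothetical helper for illustration.)
    """
    span = maturities[i + 1] - maturities[i]       # years between the two points
    sizes = []
    for curve in curves:
        not_inverted = curve[-1] > curve[0]        # assumed inversion criterion
        if not_inverted and curve[i + 1] < curve[i]:
            drop_bp = (curve[i] - curve[i + 1]) * 100.0   # percent -> basis points
            sizes.append(drop_bp / span)
    frequency = 100.0 * len(sizes) / len(curves)
    average_size = sum(sizes) / len(sizes) if sizes else 0.0
    return frequency, average_size


# Made-up curves at maturities 3 months, 1 year, 10 years.
maturities = [0.25, 1.0, 10.0]
curves = [
    [5.0, 5.5, 7.0],    # normal
    [5.0, 4.97, 7.0],   # kink between 3M and 1Y
    [8.0, 7.5, 6.0],    # fully inverted curve: not counted as a kink
    [5.0, 5.2, 6.0],    # normal
]
frequency, average_size = kink_statistics(curves, maturities, 0)
```

For a simulator, the same statistics would be computed on each path and then 
averaged over paths, as described earlier for the other statistical quantities. 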

The multivariate lognormal model described in Ch. 48 produced anomalous 
and large kinks not seen in the data. The spread statistics were also not in 
agreement with data. In particular, the spread-shift volatilities $\sigma_{shift}(T; \Delta T)$ 
were too large. The aspect that is being measured is the same that leads the LN 
model to fail the CDA tests; the fluctuations are not tightly constrained. 

The multivariate Gaussian model without mean reversion gives results 
similar to the multivariate lognormal model. Again, the fluctuations are not 
tightly constrained. Strong mean reversion appears to be critical. 


Results for the SMRG Model Description of Yield Curve Data 


The Strong Mean-Reverting Gaussian (SMRG) Model exhibits much better 
properties. One path, taken arbitrarily as #100 but typical, is the top yield curve 
in the unlabeled Fig. 1 along with the data. The sequence of random numbers 
generating this path is the same sequence that generated the lognormal path 
shown in a similar graph in Ch. 48. The statistics corresponding to the data are 
shown in Figs. 3B. The situation is vastly improved over the non mean-reverting 
models described in Ch. 48. The model kink sizes are now typically on the order 
of (and only slightly larger than) those of the data. The spread statistics, along 
with the other statistics, are in quite reasonable agreement with the data. 


Figs. 4A, 4B show histograms of the fluctuations $\delta r^{fluctuations}(t, T = 10yr)$ 
of the 10-year rate generated by the SMRG model, the fluctuations 
$\delta r^{fluctuations}(t, T = 3M)$ of the 3-month rate, and the fluctuations of the 
corresponding data. All fluctuations are measured away from the historical 
quasi-equilibrium yield-curve path $r^{Historical\ Path}(t,T)$. In contrast to the 
models without strong mean reversion, the agreement between the SMRG model 
and the data is good. This means that the fluctuations produced by the SMRG 
model are realistic. 


Results Including a Fat Tail Gaussian Component 


As mentioned before, evidence exists in the data for fat tails. Fat tails form an 
important, though temporally occasional, phenomenon. Fat tails are by definition 
not described by the SMRG model. Evidently, we need to add fat-tail effects into 
the description. In the language of the Macro-Micro Model, fat tails need to be 
added on as a correction to the Micro component". 

To get an idea of how that might work phenomenologically, we investigated 
including a small Gaussian amplitude (the "Large Gaussian" fat-tail term) for rate 
changes, with a large width, representing around 10% of the fluctuations. 

We first considered a modified SMRG model with an even larger mean 
reversion $\omega$ = 5/week representing 90% of the fluctuations. The distribution of 
the 10-year fluctuations of this modified SMRG model shown in Fig. 4C is now 
somewhat too narrow compared to that of the data. 


* Macro Fat Tails? See Ch. 51 for a suggestion that macroeconomics and fat tails are 
connected. There may be different types of fat tails for micro and macro dynamics. 


Fig. 4D shows the combination of the fat-tail Gaussian added to the modified 
SMRG model. The agreement with the data is improved with this composite 
description. 
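
Why a small admixture of a wide Gaussian produces fat tails can be seen from 
the moments of a two-component mixture. The sketch below assumes zero-mean 
Gaussian components; the 90%/10% weights follow the text, while the widths 
(1 and 3, in arbitrary units) and the function name are hypothetical. 

```python
def mixture_moments(weights, widths):
    """Variance and excess kurtosis of a mixture of zero-mean Gaussians.

    For component weight w and width s, the mixture variance is sum(w * s**2)
    and the fourth moment is 3 * sum(w * s**4).
    (Hypothetical helper for illustration.)
    """
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to one")
    variance = sum(w * s ** 2 for w, s in zip(weights, widths))
    fourth = 3.0 * sum(w * s ** 4 for w, s in zip(weights, widths))
    excess_kurtosis = fourth / variance ** 2 - 3.0
    return variance, excess_kurtosis


# 90% narrow component plus 10% wide component (widths are made-up numbers).
variance, excess = mixture_moments([0.9, 0.1], [1.0, 3.0])
```

A single Gaussian has zero excess kurtosis; the composite has a positive 
value, so it puts more weight in the tails than a Gaussian of the same 
variance while the narrow component still dominates the center. 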


How Many Free Parameters Are We Using to Describe the Yield Curve 
Fluctuations? 


The description we use is actually quite parsimonious. We emphasize that, aside 
from modeling the fat tails, there is really only one free parameter in the SMRG 
model, namely the value of the strong mean reversion. All other parameters that 
characterize the rapid fluctuations are fixed by yield-curve data. 

It is striking that the same value of the mean-reversion parameter (around 
3/week) found in the cluster-decomposition analysis also serves, without further 
adjustment, to produce model statistical properties in agreement with data. If this 
parameter is reduced, both the cluster decomposition analysis and the statistical 
tests deteriorate. As the mean reversion becomes small, the problematic results 
reappear. Indeed, the mean reversion was determined by increasing it from zero 
and choosing that value such that the CDA tests began to succeed. 


Maturity and Time Independence of the Mean-Reversion Parameter 


It is striking that the mean-reversion parameter seems roughly independent of 
maturity. We have also done the analysis with different time windows with 
similar results, so the mean-reversion parameter does not seem to depend 
strongly on the time either. 


Other Attempts to Fit the Data without Strong Mean Reversion 


As mentioned in Ch. 48, we attempted models without the strong mean reversion 
but including other effects. This included memory effects to attempt to reduce 
kinks by recall of the initial smooth yield curve, smoothing recipes, etc. 

In spite of a determined effort over a rather long period, no effort to replace 
strong mean reversion was successful in describing the statistical properties of 
the yield curve, passing the CDA tests, and avoiding kinks. 

We therefore believe that strong mean reversion is essential. 


Principal Components (EOFs) and the SMRG Model 


As described in Ch. 48, a useful characterization of yield curve movements is 
given by the EOF or principal component analysis". Principal component analysis 
decomposes yield curve movements into orthogonal components along 
eigenfunctions of the covariance matrix with associated eigenvalues. 


Numerical Results 


Figs. 5A shows the first three eigenfunctions of the yield curve movements in the 
data, labeled parallel shift, tilt, and flex. The first three eigenvalues for the data 
are in the approximate ratio 36/8/1. 

The results of the EOF analysis for the multivariate SMRG model are given 
in Figs. 5B. Both the eigenfunctions and eigenvalues are in good agreement with 
the data. 
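
For readers who want to see the decomposition concretely, the sketch below 
builds a covariance matrix from a set of yield changes and extracts the 
leading eigenvalue and eigenfunction by power iteration. Power iteration is 
used here only to keep the example free of external libraries; it is not 
claimed to be the method used for the figures, and the function names and 
sample numbers are hypothetical. 

```python
import math


def covariance_matrix(rows):
    """Covariance matrix of the columns of rows (one row per date)."""
    n, k = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(k)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / n
             for j in range(k)] for i in range(k)]


def leading_eigenpair(cov, iterations=500):
    """Largest eigenvalue and its unit eigenvector via power iteration."""
    k = len(cov)
    v = [1.0 + 0.1 * i for i in range(k)]          # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(k)) for i in range(k))
    return eigenvalue, v


# Made-up weekly changes for three maturities that move mostly in parallel.
changes = [
    [0.10, 0.09, 0.08],
    [-0.05, -0.04, -0.05],
    [0.02, 0.03, 0.02],
    [-0.07, -0.08, -0.06],
]
eigenvalue, eigenvector = leading_eigenpair(covariance_matrix(changes))
```

Subsequent components (tilt, flex) can be obtained by removing the leading 
component from the covariance matrix and repeating, or by using a standard 
symmetric eigensolver. 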


Wrap-Up for this Chapter 


We have continued and extended in this chapter our methodology for examining 
and characterizing yield-curve dynamics. The Cluster Decomposition Analysis 
(CDA) leads to sharp tests for models. Standard models fail these tests. Strong 
mean reversion about a slowly varying "quasi-equilibrium yield-curve" mean 
seems to be a preferred dynamical mechanism. We have considered a 
multivariate strongly mean-reverting Gaussian model. We believe that similar 
results would hold for other models as long as they include strong mean reversion 
(e.g. strong mean-reverting lognormal, strong mean-reverting square-root, etc.). 

An additional, but small, component is connected to crashes or jumps like 
that of October 1987, producing fat tails on the interest-rate process. 


Summary of the Physical Behavior of Interest Rates over Long Times 


Recap: Our results imply that interest rates, as exhibited in historical yield-curve 
data, do not behave as if they correspond to a stochastic process that spreads out 
in time according to the statistics of zero (or small) mean reversion models in 
common use. 

Instead, interest rates behave as if there were a mean quasi-equilibrium 
moving-average yield-curve path, which changes smoothly and slowly. The 
quasi-equilibrium yield-curve path defined by the data does move slowly with 
time and can lead to high (e.g. 20%) or low (e.g. 3%) values for rates, as well as 
inverted yield curves, all in a smooth fashion. We believe that the words 
"smooth" and "slow" are meant with respect to time scales of, say, several months. 
About this smooth quasi-equilibrium path, the actual rates fluctuate with small 
rapid movements. 

The concept of small fluctuations about a slowly varying or quasi-equilibrium 
state is one of the most pervasive and useful ideas in the physical sciences. We 
find it appealing that this idea also seems to be relevant to financial data. 


Physical Set of Paths Needs Generalization to the Macro-Micro Model


The multivariate Monte-Carlo simulator we have presented in this chapter, while 
agreeing with historical data, is not useful for risk management of contingent
claims. A physical set of paths requires future interest-rate paths to spread out in
time. It is in fact possible to obtain a simulator that both agrees with historical 
yield-curve data and has future paths that "spread out". 

The essential ingredient of this extended model, called the Macro-Micro
Model, is to regard the quasi-equilibrium slowly varying yield curve of the data
as one realization of a special kind of quasi-stochastic Macro variable effect⁹. We
suggest an interpretation of these quasi-random variables as being due to long
time-scale effects of macroeconomic forces (e.g. Fed policy). One of these
quasi-equilibrium paths is realized historically,
$r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T)$. The rapid Micro SMRG
fluctuations are interpreted as being due to trading activity, reacting to
individual market events, while closely following the overall macroeconomically
produced slow trends.

The large mean reversion of the Micro SMRG model, coupled with the 
spreading out of future paths due to the Macro effects, can lead to a small total 
"effective" mean reversion. This is similar to mean reversion sometimes used in 
current one or two-factor models. 

However, the big difference is retaining the agreement with yield-curve shape 
statistics not enjoyed by standard models. 


The standard no-arbitrage requirements can be incorporated in the Macro-Micro
model by finding an appropriate yield-curve path about which fluctuations (both
Macro and Micro) occur. This is explained in Ch. 51.


Appendix A: Definitions and Stochastic Equations 


In this appendix, we repeat some definitions of quantities used in the text along
with some details of the stochastic equations defining the various models. Further
details are in Ref. iii. Define the $T$-year rate at time $t$ for path $\alpha$ as
$r_\alpha(t,T)$. Data will be denoted by the same symbol without the path label.
For some function $x$ of the rates and the time, the differential change $d_t x$
over small time $dt$ is taken as

$$d_t x[r_\alpha(t,T),t] = \mu_x[r_\alpha(t,T),t]\,dt + \sigma_x[r_\alpha(t,T),t]\,dZ_\alpha(t,T)$$   (49.3)

The correlations between rates of different maturities give the correlation
matrix. For the three cases of lognormal, Gaussian, and mean-reverting Gaussian
(MRG) the function change $d_t x$ is given in terms of the rates by

Model                  $d_t x(t,T)$
Lognormal              $d_t r(t,T) / r(t,T)$
Gaussian               $d_t r(t,T)$
Mean-Rev. Gaussian     $d_t r(t,T) + \omega(t,T)\, r(t,T)\, dt$
                                                                        (49.4)

9 Quasi? That adjective is used to instill the idea that the Macro variables will not be
ordinary Brownian variables that scale down to arbitrarily short times.


Here $\omega(t,T)$ is the mean-reversion parameter. A priori it can be a function
of both time and maturity, although in practice, as mentioned in the text, data are
consistent with this quantity being roughly independent of both time and maturity.
For the MRG model the drift can be written

$$\mu_x[r_\alpha(t,T),t]\,dt = d_t r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T) + \omega(t,T)\, r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T)\,dt$$   (49.5)


Here $r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T)$ is the slowly varying
quasi-equilibrium path defined by the data as described in the text, around which
the yield-curve fluctuations occur. It is similar to the "classical path" described
earlier in the book. The fluctuations are defined with respect to the
quasi-equilibrium path:

$$\delta r_\alpha(t,T) = r_\alpha(t,T) - r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T)$$   (49.6)


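The solution to the MRG stochastic equation, as can be checked by
differentiating, is (taking the path to start on the quasi-equilibrium path at
time $t_0$)

$$r_\alpha(t,T) = r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T) + \int_{t_0}^{t} \exp\left[-\int_{t'}^{t}\omega(t'',T)\,dt''\right] \sigma_x(t',T)\, dZ_\alpha(t',T)$$   (49.7)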
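
The MRG dynamics of Eqns. (49.4)-(49.7) can be illustrated with a short simulation. The following is a minimal sketch, not the simulator used in the text: it assumes a single maturity, a constant mean reversion `omega` and volatility `sigma`, a simple Euler time step, and a hypothetical straight-line quasi-equilibrium path. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_mrg(r_qe, omega, sigma, dt, rng):
    """Euler steps of dr = dr_qe - omega*(r - r_qe)*dt + sigma*dZ, starting on the path."""
    r_qe = np.asarray(r_qe, dtype=float)
    r = np.empty_like(r_qe)
    r[0] = r_qe[0]
    dz = rng.normal(0.0, np.sqrt(dt), size=len(r_qe) - 1)  # Wiener increments
    for n in range(len(r_qe) - 1):
        fluct = r[n] - r_qe[n]  # fluctuation about the quasi-equilibrium path
        # The trend moves with the path; the fluctuation decays at rate omega.
        r[n + 1] = r_qe[n + 1] + fluct * (1.0 - omega * dt) + sigma * dz[n]
    return r

rng = np.random.default_rng(1)
r_qe = np.linspace(10.0, 8.0, 261)  # hypothetical 5-year trend on a weekly grid, in percent
path = simulate_mrg(r_qe, omega=5.0, sigma=1.0, dt=1.0 / 52.0, rng=rng)
print("std of fluctuations about the trend:", np.std(path - r_qe))
```

For constant parameters the fluctuations settle to a width of roughly `sigma / sqrt(2 * omega)`, which is why a large mean reversion keeps the simulated rate in a narrow band about the trend.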


The shifts between rates of adjacent maturities on the yield curve, as in Ch.
48, are denoted by

$$\Delta_T r_\alpha(t,T;\Delta T) = r_\alpha(t,T+\Delta T) - r_\alpha(t,T)$$   (49.8)

Here, $\Delta T$ is the maturity increment between defined nearest-neighbor
maturities on the yield curve. The spread-shift volatility
$\sigma_{\Delta_T r}(T;\Delta T)$ is averaged over time and paths.

As in Ch. 48, we define a yield-curve "kink" when a yield of higher maturity
is lower than a yield of lower maturity. The mean shifts and volatilities for kinks 
are defined as before, with the additional filter that the yield spread is negative 
producing the kink in the first place. 
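
The shift and kink definitions above can be sketched in code. This is a minimal illustration that assumes one yield curve given as an array ordered by increasing maturity, with kinks identified on nearest-neighbor spreads as in Eqn. (49.8). The function name and the sample yields are hypothetical.

```python
import numpy as np

def find_kinks(yields):
    """Return (kink_indices, spreads) for yields ordered by increasing maturity.

    spreads[i] = yields[i + 1] - yields[i]; index i is a kink when the spread is negative.
    """
    y = np.asarray(yields, dtype=float)
    spreads = np.diff(y)  # nearest-neighbor maturity shifts
    return np.flatnonzero(spreads < 0.0), spreads

# Hypothetical curve (percent): the third yield dips below the second.
kinks, spreads = find_kinks([5.0, 5.2, 5.1, 5.4])
print("kink positions:", kinks, "spreads:", spreads)
```

Kink statistics then follow by applying the usual mean and volatility estimators to the spreads at the flagged positions only.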


Appendix B: The Cluster-Decomposition Analysis (CDA) 


In this appendix, we describe the cluster-decomposition analysis CDA, as defined 
in theoretical particle physics (Ref. xi). This CDA has nothing to do with "cluster 
decomposition" as sometimes used in statistics. 


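Consider $\Gamma$ time series $x_\gamma(t)$, [$\gamma = 1,2,3,\ldots,\Gamma$], where
some or all series may be the same. Call the $\Gamma$th-order correlation (actually
covariance) function of these series
$M^{(123\ldots\Gamma)}_{\tau_1 \tau_2 \ldots \tau_\Gamma}[x_1,\ldots,x_\Gamma]$, or
$M_\Gamma^{(123\ldots\Gamma)}$ for short. Define $\{\tau_\gamma\}$ as time lags. Then
$M_\Gamma^{(123\ldots\Gamma)}$ is the point-by-point sum of products of lagged
variables¹⁰:

$$M_\Gamma^{(123\ldots\Gamma)} = \frac{1}{N}\sum_{n=1}^{N} x_1(\tau_1 + n\,\Delta t)\cdots x_\Gamma(\tau_\Gamma + n\,\Delta t)$$   (49.9)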


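The mean of a time series $x_\gamma$ with lag $\tau_\gamma$ is
$M_1^{(\gamma)} = M^{(\gamma)}_{\tau_\gamma}[x_\gamma]$. The second-order and
third-order connected correlation (actually covariance) functions, called
$C_2^{(12)} = C^{(12)}_{\tau_1\tau_2}[x_1,x_2]$ and
$C_3^{(123)} = C^{(123)}_{\tau_1\tau_2\tau_3}[x_1,x_2,x_3]$, are standard-looking
expressions:

$$C_2^{(12)} = M_2^{(12)} - M_1^{(1)} M_1^{(2)}$$   (49.10)

$$C_3^{(123)} = M_3^{(123)} - \sum_{\text{perm}\,(abc)} C_2^{(ab)} M_1^{(c)} - \prod_{\gamma=1}^{3} M_1^{(\gamma)}$$   (49.11)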


The sum in Eqn. (49.11) runs over permutations (abc) = (123), (231), (312). It
should be emphasized again that these are dynamic Green functions, not simple
static statistical measures.
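
Eqns. (49.9)-(49.11) translate directly into sample estimators. The sketch below is one possible implementation under stated assumptions: the lags are integer multiples of the sampling interval, the sum index starts at 0 rather than 1, and the function names are hypothetical. The test series is synthetic Gaussian noise plus a constant drift, not market data.

```python
import numpy as np

def lagged_moment(series, lags, n_points):
    """Sample version of Eqn. (49.9): mean over n of the product of x_gamma[lag_gamma + n]."""
    prod = np.ones(n_points)
    for x, lag in zip(series, lags):
        window = np.asarray(x, dtype=float)[lag:lag + n_points]
        if window.size != n_points:
            raise ValueError("series too short for this lag and n_points")
        prod = prod * window
    return prod.mean()

def connected_c2(xa, xb, lag_a, lag_b, n_points):
    """Eqn. (49.10): C2 = M2 - M1 * M1."""
    m2 = lagged_moment([xa, xb], [lag_a, lag_b], n_points)
    return m2 - lagged_moment([xa], [lag_a], n_points) * lagged_moment([xb], [lag_b], n_points)

def cda_third_order(series, lags, n_points):
    """Eqn. (49.11): returns (C3, M3) for three series and three integer lags."""
    m1 = [lagged_moment([x], [lag], n_points) for x, lag in zip(series, lags)]
    m3 = lagged_moment(series, lags, n_points)
    c3 = m3 - m1[0] * m1[1] * m1[2]
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:  # permutations (123), (231), (312)
        c3 -= connected_c2(series[a], series[b], lags[a], lags[b], n_points) * m1[c]
    return c3, m3

rng = np.random.default_rng(0)
x = 2.0 + rng.standard_normal(50_000)  # Gaussian noise plus a constant drift
c3, m3 = cda_third_order([x, x, x], lags=[0, 3, 7], n_points=49_000)
print("M3 =", m3, " C3/M3 =", c3 / m3)  # M3 is far from zero; the ratio should be near zero
```

Because every average runs over the same index set, the estimated $C_3$ is algebraically the sample mean of the product of the three mean-subtracted lagged series.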


10 Notation: The time interval $\Delta t$ is arbitrary. We assume $N \gg 1$, so $1/(N-1) \approx 1/N$ etc.


Using Time-Differenced Series in the CDA

The same definitions given above can be made with each $x(t)$ replaced by
$d_t x(t) = x(t + dt) - x(t)$. The difference is one of application. The advantage
of using $d_t x$ instead of $x(t)$ is that with $d_t x$ correlations, time
expectations of products of Wiener measures occur. With $x(t)$ correlations, time
integrals of these expectations are encountered. The $d_t x$ correlations provide a
"strong" pointwise test in the time lags, whereas the $x$ correlations only provide
a "weak" integrated test.

We can also use $(d_t x - \mu_x)$ time series, or $(d_t x - \mu_x)/\sigma_x$ time
series. The latter measures the number of standard deviations.


Checking Model Hypotheses with the CDA 


The main point is that generally for Gaussian processes, and also for lognormal
processes with distinct indices, the third-order connected correlation function is
identically zero, i.e.

$$C_3^{(123)} = 0$$   (49.12)

The third-order function $M_3^{(123)}$ is not equal to zero for Gaussian processes
if the means are nonzero, and not equal to zero for lognormal processes, viz

$$M_3^{(123)} \neq 0$$   (49.13)

Therefore, an acid test for the validity of models is to check the relationship

$$\frac{C_3^{(123)}}{M_3^{(123)}} \overset{?}{\approx} 0$$   (49.14)


Formal Proof of $C_3 = 0$ for Gaussian Processes


The proof of this statement¹¹, as well as the general case for higher-order
correlation functions, follows from the generating functional formalism. In outline,
the argument goes as follows. Define the generating functional $Z$ of "currents"
$J_\gamma(t_n)$ for a general Gaussian probability density function (exponentiated
quadratic function of the $x$'s) by

$$Z(J) = \prod_{\gamma,n} \int_{-\infty}^{\infty} dx_\gamma(t_n)\, \exp\{-x\cdot A\cdot x + J\cdot x\}$$   (49.15)

11 Proof with Time Differences: The proof with time-differenced variables goes through
in the same way. Only the form of the matrix $A$ changes.

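Here, $A$ is the matrix defining the quadratic form,

$$x\cdot A\cdot x = \sum_{\gamma,\gamma';\,n,n'} x_\gamma(t_n)\, A_{\gamma\gamma'}(t_n,t_{n'})\, x_{\gamma'}(t_{n'})$$   (49.16)

Also,

$$J\cdot x = \sum_{\gamma,n} J_\gamma(t_n)\, x_\gamma(t_n)$$   (49.17)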


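Averages are then defined by

$$\langle O[x_1(t_1),\ldots,x_\Gamma(t_\Gamma)] \rangle = \frac{1}{Z(0)} \prod_{\gamma,n} \int_{-\infty}^{\infty} dx_\gamma(t_n)\; O[x_1(t_1),\ldots,x_\Gamma(t_\Gamma)]\, \exp\{-x\cdot A\cdot x\}$$   (49.18)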


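Notice that $Z$ can be evaluated by Gaussian integration as

$$Z(J) = Z(0)\, \exp\{\tfrac{1}{4}\, J\cdot A^{-1}\cdot J\}$$   (49.19)

Here,

$$J\cdot A^{-1}\cdot J = \sum_{\gamma,\gamma';\,n,n'} J_\gamma(t_n)\, [A^{-1}]_{\gamma\gamma'}(t_n,t_{n'})\, J_{\gamma'}(t_{n'})$$   (49.20)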


Also, notice that a factor $x_\gamma(t_n)$ can be included in an average by
differentiation of $Z$ with respect to the associated current $J_\gamma(t_n)$, and
then setting all currents to zero, by definition. Thus averages of products of
$x$'s can be obtained by differentiating the evaluated form of $Z$ by the current
associated with each $x$ in the product. This serves to evaluate the correlation
functions defined in the text.

From the above remarks along with the fact that the integral of an odd
function over symmetric limits is zero, the reader may convince himself after
some algebra that the connected three-point function is identically zero for a
Gaussian probability density function. The cluster decomposition formulae for
higher-order correlation functions follow from similar manipulations.
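
The algebra can be shortened by working with the logarithm of $Z$. The following is a sketch of that step, assuming a symmetric matrix $A$, a compound index $a = (\gamma, n)$, and the standard result that derivatives of $\ln Z$ with respect to the currents generate the connected functions. The symbol $W$ is introduced here for illustration only.

```latex
W(J) \equiv \ln Z(J) = \ln Z(0) + \tfrac{1}{4}\, J \cdot A^{-1} \cdot J
\qquad \text{[from Eqn. (49.19)]}

\left. \frac{\partial^2 W}{\partial J_a\, \partial J_b} \right|_{J=0}
  = \tfrac{1}{2}\, [A^{-1}]_{ab} = C_2^{(ab)}

\left. \frac{\partial^3 W}{\partial J_a\, \partial J_b\, \partial J_c} \right|_{J=0}
  = 0 = C_3^{(abc)}
```

Since $W$ is quadratic in the currents, every derivative beyond second order vanishes. A nonzero mean only adds a term linear in $J$ to $W$, which changes $M_3$ but leaves $C_3 = 0$.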


More About the Cluster Decomposition Analysis 


A convenient "bubble" diagram notation¹² for the third-order cluster
decomposition equation is shown below:


[Figure: 3rd-Order Cluster Decomposition Diagram]


To read this, imagine that lines to the left represent "particles" coming "in"
and lines to the right represent "particles" going "out". The symbol $C$ in a
bubble means that those particles attached to that bubble interact, and $M$ means
that particles may or may not interact. One line through an M bubble means that 
the particle goes in and out untouched. The terms on the right hand side of the 
equation represent all possible ways that the particles can interact or not interact. 
The equation can also be written with C on the left hand side, as in Eqn. (49.11). 


For Gaussian noise, we have both $M_3^{(123)} = 0$ and $C_3^{(123)} = 0$ (as the
number of data points $N \to \infty$). For Gaussian noise plus a deterministic
drift, $M_3^{(123)}$ is no longer zero, but $C_3^{(123)} = 0$ still holds.


For a lognormal model, an exponentiated form is needed for the rate $r$. We
need the replacement

$$\sigma N(0,1) \to \exp[\sigma N(0,1)]$$   (49.21)

We wind up with exponentiated Gaussian noise, for which $M_3^{(123)} \neq 0$.

12 "Bubble Notation" for Green function equations: This notation was invented by the
physicist David Olive. Similar results hold for any arbitrary number of lines in or out. It is
a great mnemonic to remember the formulae, which otherwise have to be derived using
the functional techniques, and it always works.


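Specifically, for lognormal noise with $L_a = \exp[\sigma_a N_a(0,1)]$ for index $a$
(and similarly for indices $b, c$), the Green functions through third order are¹³:

$$M_1^{(a)} = \langle \exp[\sigma_a N_a(0,1)] \rangle = \exp[\sigma_a^2/2]$$
$$M_2^{(ab)} = \langle \exp[\sigma_a N_a(0,1) + \sigma_b N_b(0,1)] \rangle = \exp[(\sigma_a^2 + \sigma_b^2 + 2\delta_{ab}\sigma_a\sigma_b)/2]$$
$$C_2^{(ab)} = \exp[(\sigma_a^2 + \sigma_b^2)/2]\,[\exp(\delta_{ab}\sigma_a\sigma_b) - 1]$$   (49.22)

$$M_3^{(abc)} = \langle \exp[\sigma_a N_a(0,1) + \sigma_b N_b(0,1) + \sigma_c N_c(0,1)] \rangle$$
$$\qquad = \exp[(\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + 2\delta_{ab}\sigma_a\sigma_b + 2\delta_{bc}\sigma_b\sigma_c + 2\delta_{ca}\sigma_c\sigma_a)/2]$$
$$C_3^{(abc)} = \exp[(\sigma_a^2 + \sigma_b^2 + \sigma_c^2)/2]\,\Big\{ \exp(\delta_{ab}\sigma_a\sigma_b + \delta_{bc}\sigma_b\sigma_c + \delta_{ca}\sigma_c\sigma_a) - \sum_{\text{perm}\,(abc)} \exp(\delta_{ab}\sigma_a\sigma_b) + 2 \Big\}$$   (49.23)

Note that if $a, b, c$ are all distinct then $C_3^{(abc)} = 0$ since the curly bracket
is $\{1 - 3 + 2\} = 0$. If $a = b$ and $a \neq c$, we get $C_3^{(abc)} = 0$ for the
lognormal case.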
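
The closed forms (49.22)-(49.23) are easy to check numerically. The sketch below evaluates them for three index patterns. It assumes that equal labels denote the same underlying normal variable (so that $\delta = 1$), and the function name and volatility values are hypothetical.

```python
import math

def lognormal_third_order(labels, sigmas):
    """Closed forms (49.22)-(49.23): returns (M3, C3) for L = exp(sigma * N(0,1)).

    Equal labels mean the same underlying normal variable (delta = 1).
    """
    a, b, c = labels
    sa, sb, sc = sigmas
    delta = lambda i, j: 1.0 if i == j else 0.0
    pairs = [delta(a, b) * sa * sb, delta(b, c) * sb * sc, delta(c, a) * sc * sa]
    prefactor = math.exp((sa**2 + sb**2 + sc**2) / 2.0)  # product of the three means M1
    m3 = prefactor * math.exp(sum(pairs))
    # Curly bracket of Eqn. (49.23): exp(all pairs) - sum over pairs of exp(pair) + 2.
    bracket = math.exp(sum(pairs)) - sum(math.exp(p) for p in pairs) + 2.0
    return m3, prefactor * bracket

for labels in [(1, 2, 3), (1, 1, 2), (1, 1, 1)]:
    m3, c3 = lognormal_third_order(labels, (0.5, 0.5, 0.5))
    print(labels, "M3 =", m3, "C3 =", c3)
```

Only the case with all three indices equal gives a nonzero bracket, where $C_3$ reduces to the third central moment of a single lognormal variable.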


Figures: Strong Mean-Reverting Multifactor Yield-Curve Model 


Fig. [1]. 3-D plot of weekly yield-curve data over the period 1983-1988 as
well as a path (of all maturities) from the Monte-Carlo simulation for the strong
mean-reverting Gaussian (SMRG) model, starting from the same initial yield
curve. The two plots are intentionally unlabeled. Can you tell which is which?¹⁴


Fig. [2]. Results from the cluster decomposition analysis. Figs. 2A show the
correlation functions using a set of exponentiated Gaussian random numbers.
Note that $C_3/M_3 \approx 0$. Figs. 2B show the same analysis using data along with
the hypothesis that the fluctuations in the data result from a zero-mean-reversion
lognormal process. If this lognormal hypothesis were correct, $C_3/M_3$ should be
small. However, $C_3/M_3$ is not small, showing that the lognormal assumption is
incorrect. Figs. 2C show the same analysis using data along with the hypothesis
that the fluctuations in the data result from the SMRG model. Note that $C_3/M_3$
is small, so the SMRG model passes this test.

13 Notation: $\delta_{ab} = 1$ if $a = b$ and 0 otherwise.

14 Answer: The top figure contains the model and the bottom figure contains the data.


Fig. [3]. Statistical measures of the yield-curve time series. Figs. 3A show
data for the shifts between neighboring maturities. These are the shift means
$\mu_{\Delta_T r}(T;\Delta T)$, standard deviations $\sigma_{\Delta_T r}(T;\Delta T)$,
volatilities $\sigma(T)$, correlations between the 3-month rate and other rates
$\rho(3M,T)$, and correlations between the 10-year rate and other rates
$\rho(10yr,T)$. The maturity index runs from 1 to 11, representing the 3-month,
6-month, and the 1, 2, 3, 4, 5, 7, 10, 20, and 30-year rates, respectively. Figs. 3B
show the same graphs for the time and path-averaged SMRG simulation. The
agreement of the SMRG model and the data is quite reasonable.


Fig. [4]. Figs. 4 show histograms of fluctuations of rates about the historical 
"quasi-equilibrium yield-curve path", the slowly varying moving-trend yield 
curve discussed in the text. Figs. 4A and 4B show the fluctuations of the 10 year 
and 3 month rates for the data, compared to the SMRG model. Fig. 4C shows the 
fluctuations of the 10-year rate for the data compared to a modified SMRG model 
with larger mean reversion. Fig. 4D shows the fluctuations of the 10 year rate for 
the data compared to a composite model consisting of a 90% component of the
modified MRG model in Fig. 4C along with a 10% component of a wide 
Gaussian distribution, phenomenologically accounting for the fat tails. 


Fig. [5]. Principal Component EOF analysis. Figs. 5A show data for the first 
three eigenfunctions plotted against maturity, labeled parallel, tilt, and flex. The 
parallel shift is not exactly parallel. The eigenvalues are also shown. Figs. 5B 
show the same graphs for the SMRG model.


[Figure 1: Data and Strong Mean-Reverting Gaussian Model. Figures intentionally not labelled.]

[Figure 2A: Math "Experiment" to Test CDA Analysis (not data). Panels: 2-point function; 3-point functions, 10-year rate. Axes: time covariance vs. weeks. Annotation: $C_3/M_3$ is zero (OK).]

[Figure 2B: Data Viewed as a Lognormal Process. Two-point functions, 10-year rate.]

[Figure: Data Viewed as a Mean-Reverting Process. Two-point functions, 10-year rate.]

[Figure 3A: Data Statistics (1983-88).]

[Figure: Strong MRG Model Statistics ($\omega = 5$).]

[Figure: SMRG Model with $\omega = 3$ (solid) and Data (dotted): Fluctuations from Historical Quasi-Equilibrium for 10 yr rate.]

[Figure: SMRG Model with $\omega = 3$ (solid) and Data (dotted): Fluctuations from Historical Quasi-Equilibrium for 3M rate. Axis: frequency of occurrence.]

[Figure 4C: SMRG Model with $\omega = 5$ (solid) and Data (dotted): Fluctuations from Quasi-Equilibrium for 10 year rate. Axis: frequency of occurrence.]

[Figure: Principal Components: Data.]


50. The Macro-Micro Yield-Curve Model
(Tech. Index 5/10) 


Summary of this Chapter 


This chapter contains a description of the Macro-Micro multi-time scale,
multifactor yield-curve model¹ ². The reader can understand this chapter without
reading previous chapters; necessary material will be summarized. In Ch. 49, we
showed that yield-curve statistical data are consistent with small rapid
fluctuations implying strong mean reversion, with fluctuations about an historical
"quasi-equilibrium" yield curve $r^{\text{Historical}}_{\text{Quasi-Equil. Path}}(t,T)$,
parameterized from the data. Two time scale regimes are envisioned: short (the
Micro component) and long (the Macro component). For short Micro times, the
strong mean reversion model is used, because this agrees with the data
fluctuations. A fat-tail component is also present³.

For the long Macro times, we need an additional ingredient. The basic 
concept is the quasi-equilibrium yield curve. In this chapter, the historical quasi- 
equilibrium yield curve used in Ch. 48, 49 is generalized. For the long Macro 
times, quasi-stochastic variables produce quasi-random quasi-equilibrium yield 
curves. This allows future interest-rate paths to spread out or fan out to achieve 
high or low values. The Macro-Micro model not only is in accord with the 
historical yield-curve dynamics, but it satisfies no-arbitrage properties that we 
investigate in Ch. 51. 


1 History: This chapter is based on work done with Alan Beilis in Ref. i. Recent
developments in the Macro-Micro model are covered in Ch. 51. The work was described
in various talks, and in my CIFEr tutorial.

2 Acknowledgements: I thank Eric Slighton and Alain Nairay for helpful conversations.

3 Fat Tails Are an Extra Micro Component: Besides the Micro strong mean reverting
effects, there are occasional large, quick jumps producing fat tails. In Ch. 49, we followed
Ref. ii, employing a phenomenological form for fat tails using a large-width Gaussian. See
Ch. 21 that continues this idea. See also Ch. 46, which describes the nonlinear diffusion
Reggeon Field Theory as a possible dynamical mechanism for fat tails.

In a sense, the statistical properties of the yield curve form physical
constraints that are just as real as the universally accepted idea that zero-coupon 
bond prices should be reproduced by models. 

We propose an interpretation of the Macro slowly varying quasi-equilibrium 
paths. The interpretation is that the slow variation of the Macro paths is due to 
macroeconomic effects (Fed. actions, etc.). The small, rapid Micro fluctuations 
are proposed to be due to trading activities. These small Micro fluctuations 
closely follow the smooth Macro trends, reacting to market events. 


Introduction to this Chapter 


In the two previous chapters, empirical studies of yield curve movements using a 
number of statistical techniques were described. Several multivariate yield-curve 
models were examined, and two crucial points were made regarding models and 
data. First, we emphasized the importance of a slowly varying quasi-equilibrium
yield-curve path in time that defines the mean (i.e. drift, trend, moving average) of
the data over time scales of months to a few years. Second, yield-curve data were
observed to behave as rapid movements tightly constrained by strong mean 
reversion about this slowly moving quasi-equilibrium yield curve. 

Specifically, a multivariate strongly mean-reverting Gaussian (SMRG) model 
was shown to provide good agreement with the statistical properties of the 
fluctuations of the data about a reasonably defined slowly-varying quasi- 
equilibrium mean yield-curve path. Phenomenologically, the quasi-equilibrium 
historical yield curve path was taken to be composed of simple line segments, 
defined by the data trends. The statistical probes included the usual correlations 
and volatilities, the absence of anomalous yield-curve shapes (kinks), spread 
volatilities, a sharp probe involving third-order correlation functions, and 
principal component (EOF) decompositions of yield-curve movements. 

Standard models without strong mean reversion, which produce interest rates 
that "fan out", surprisingly do not appear to be in accord with important features 
of data. The most telling disagreement is the presence of large kinks in yield 
curves produced by multifactor models without strong mean reversion. Other 
related problems with statistical measures are present with standard models. 

The success of this description of data using a multivariate strong mean- 
reverting model was, however, tempered with the realization that the use of such 
a model to price contingent claims was problematic. Contingent claims are 
currently priced mostly using one or two-factor interest rate diffusion models that 
do not have strong mean reversion (or any mean reversion), and therefore do 
indeed have interest rate paths that fan out. Hence, the strong mean-reversion 
model needed to fit past yield-curve data needs modification in order to describe 
future interest rate movements that can reasonably price options. 

We therefore want to modify the strong MRG model with two goals in mind:

1. Maintain the agreement with yield-curve dynamical data
2. Price options with parameters that can be roughly reinterpreted as 
parameters of standard models. 


This chapter constructs a model that achieves these goals; we call it the 
Macro-Micro yield-curve model, realized as a Monte-Carlo simulator. The 
interpretation of the Macro-Micro model involves both macroeconomic effects 
and trading activities. 


Spectral Decomposition and Time Scales 


Essentially, we are breaking the spectrum of fluctuations up into low and high 
frequency regimes. The low frequencies are assumed due to macroeconomic 
effects; we call these "macro fluctuations". The high frequencies are assumed due 
to trading; we call these "micro fluctuations". Actually, there are probably an 
infinite number of time scales. The micro component is hardwired and 
determined to agree with yield-curve historical data. The micro parameters are 
therefore fixed. 
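
The low/high frequency split can be illustrated with a simple filter. The sketch below uses a centered moving average as a stand-in for the macro trend; this is an assumed substitute for illustration, since the text parameterizes the historical trend with straight-line segments. The function name, window length, and synthetic series are all hypothetical.

```python
import numpy as np

def macro_micro_split(rates, window):
    """Centered moving average (macro proxy) and the residual about it (micro proxy)."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the average is centered")
    x = np.asarray(rates, dtype=float)
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")  # repeat the end values to keep the length
    kernel = np.ones(window) / window
    macro = np.convolve(padded, kernel, mode="valid")
    return macro, x - macro

# Synthetic weekly series: a slow trend plus small rapid noise (rates in percent).
rng = np.random.default_rng(4)
weeks = np.arange(260)
series = 8.0 + 2.0 * np.sin(2.0 * np.pi * weeks / 260.0) + 0.2 * rng.standard_normal(260)
macro, micro = macro_micro_split(series, window=27)
print("std of micro component:", micro.std())
```

The window length plays the role of the separating time scale: variations slower than the window stay in the macro part and faster ones go to the micro part.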

The idea of fundamental time scales is a new ingredient not present in the 
standard models used in pricing derivatives. We repeat that we are forced into 
this description by the yield-curve data, especially the dynamical shape statistics 
with the absence of kinks, as described in the last chapter. In our view, the usual 
models without explicit time scales that price options only provide approximate 
parameterizations of market data. The Macro-Micro model can reduce 
approximately to simpler models. 


The Macro Component and Macroeconomics 


The basic idea is that the slowly varying quasi-equilibrium means, around which 
fluctuations occur, are themselves stochastic, but only on long time scales. This 
stochastic process does not scale down to short time scales, and therefore is not 
Brownian. We call this behavior "quasi stochastic". In this model, a stochastic
property with average time scales on the order of months to years is attributed to
the smooth long time-scale effects of macroeconomic variables on interest
rates.

We do not attempt to model these macroeconomic effects directly, as this is 
far beyond our abilities. Rather, we describe these effects in a phenomenological 
way, containing parameters. One of the parameters is a "macro-volatility", 
providing the quasi randomness, and accounting for the uncertainties in the long- 
term trends due to macroeconomic effects. This macro-volatility can be 
determined by (1) taking a view on the overall volatility of interest rates or (2)
using implied techniques, i.e. demanding that the model prices and market prices
agree for options. 

In this model, slowly varying long-time-scale effects are due to inflation, Fed 
policy, third-world debt, etc., which produce trend-related interest-rate effects 
and are responsible for the smoothly varying quasi-equilibrium yield-curve path 
actually realized historically in the real world. Such effects would exist even in
the absence of trading activities⁴. These trends can lead to very high (20%) and low
(1%) rates, as well as inverted yield curves, etc. In our model, these
characteristics are attained in a smooth fashion acting on time scales of months or 
greater. 

A given realization of these macroeconomic events leads to a given macro- 
path. Thus, in historical data, one realization existed to form the historical quasi- 
equilibrium yield curve path of the trends of rates. 

Figure 1 (at the end of the chapter) shows a collection of these macro-paths 
for one yield produced by the macro simulator. The macro simulator is described 
in Section II. 


The Micro Component and Short-Term Trading Activity 


Short time-scale fluctuations on the order of minutes to weeks are proposed as 
due to trading activities that react to individual market events while attempting to 
follow overall trends. These fluctuations mostly are rapid and small; in fact, 
traders do an excellent job of following the trends. As they say, “the trend is your 
friend". Data show that fluctuations are less than 100 bp on either side of a given 
trend, and this fact is roughly independent of time. That is, traders act the same 
"now" as they acted "five years ago”; the fluctuations are relatively stationary. 
The fluctuations about a given trend, or macro-path, constitute what we call 
"micro-paths". 

Figure 2 (at the end of the chapter) illustrates the micro-paths about one 
simulated macro-path for one of the eleven yields produced by the micro 
multifactor simulator. The micro part of the Macro-Micro model was described in
Ch. 49. There, the fluctuations were described by a strong mean reversion about
the quasi-equilibrium trends of the data [i.e. the macro path as realized
historically], which did describe the yield-curve data.


4 Traders in the Closet? A picturesque way to describe the macro paths is to imagine an
alien world that eliminates trading activities by putting all the traders in a closet. The
remaining rate changes would only be due to macro effects. A proxy for this perhaps
rather interesting but impractical action is to look at a rate that is not traded, such as the
Fed Funds target rate or the prime cash rate. See Fig. 3 at the end of the chapter.


Prototype: Prime (Macro) and Libor (Macro + Micro)


Consider Fig. 3, showing Prime rate and 6M Libor rate data⁵. The Prime rate,
which is not traded, is a Macro variable. The Libor rates can be thought of as 
being composed of small rapid fluctuations (the Micro component) around a 
Macro path (the Prime rate minus a rather stable spread). The Prime rate is not 
unique; we could have exhibited the Fed Funds rate for example. 


Remainder of this Chapter 


The rest of this chapter is organized as follows. In Section II, we present details 
of the Macro-Micro yield-curve model. Section III contains a wrap-up. Appendix 
A contains some remarks on no-arbitrage conditions, scenario analysis, and yield- 
curve dynamics. The no-arbitrage discussion is continued and amplified in the 
next chapter, Ch. 51. 


Details of the Macro-Micro Yield-Curve Model 


In this section, we give details of the Macro-Micro model simulator. As described 
in the preceding section, this simulator pictorially consists of random tubes of 
yields. The construction of a given tube surrounding a given realized mean yield- 
curve path was shown in Ch. 49 using a strong mean-reverting Gaussian model, 
producing agreement with the statistics of yield-curve data. We retain this 
construction here. That is, the micro component is fixed. The part of the 
simulator left is the quasi-random nature of the means, the macro-path simulator. 

Our aim is pragmatic; we wish to construct a "reasonable" macro-path 
simulator without getting bogged down in an extremely difficult attempt to 
"derive" its properties from macroeconomic data. This new Macro component of 
the Macro-Micro model will therefore be phenomenological and contain 
parameters. These parameters can be constrained both by historical trend data of 
interest rates and/or by obtaining "implied" parameters obtained by equating 
model prices of contingent claims to market prices. 


The Importance of the Time Step in the Macro Simulator 


The basic feature of the macro-simulator is that the time-step, usually an 
irrelevant variable that in practice is made as small as possible, must instead play 
a central role. This is because we do not want to allow the macro-fluctuations to 
scale down to the same time scales as the micro-fluctuations. If that were the 
case, nothing new would have been accomplished. The macro-fluctuations are
supposed, after all, to describe the smooth trends or quasi-equilibrium yield-curve
paths around which the micro-fluctuations occur.

5 Acknowledgements: I thank Citigroup for the use of these data, which are the same as
those shown in my CIFEr tutorial.

The macro simulator is based on the observation that a reasonable
parameterization of trends consists of simple straight-line fits over different time 
periods that exhibit upward or downward movements of interest rates. These time 
steps have different lengths, and the slopes of the trend lines are different from 
one period to the next. 

While in principle this description could be maturity-dependent, we choose to 
take it independent of maturity for simplicity. 


Model for the Random Macro Time Step Dynamics 


We model the (positive) macro time step $\Delta t_{\text{macro}}$, over which a given
straight-line trend occurs, as a random variable. The probability density function
$\wp[\Delta t_{\text{macro}}]$ for macro time steps is taken as a cutoff, displaced
lognormal distribution. The distribution has a cutoff $\tau_{\text{cutoff}}$ below
which time-step values are not permitted, on the order of one month. This marks the
transition into the Micro time regime. There is a central value
$\tau_{\text{macro}}$. The macro time volatility $\sigma^{(t)}_{\text{macro}}$
describes the uncertainty in the macro time intervals.
Explicitly, for $\Delta t_{\text{macro}} > \tau_{\text{cutoff}} > 0$, we write

$$\wp[\Delta t_{\text{macro}}] = \frac{1}{\sigma^{(t)}_{\text{macro}}\sqrt{2\pi}} \exp\left\{ -\frac{\ln^2\left[\dfrac{\Delta t_{\text{macro}} - \tau_{\text{cutoff}}}{\tau_{\text{macro}} - \tau_{\text{cutoff}}}\right]}{2\left[\sigma^{(t)}_{\text{macro}}\right]^2} \right\}$$   (50.1)

Dynamical Model for the Random Macro Slope or Drift 


Over a given time interval with length $\Delta t_{\text{macro}}$ expressed in months,
we choose a slope for the interest-rate trends, which we call $\lambda$. This
parameter has units of bp/yr per month, so $\lambda\,\Delta t_{\text{macro}}$ measures
a macro change in rates over time $\Delta t_{\text{macro}}$.

A simple Gaussian probability density function $\wp[\lambda]$ is both rational⁶ and
useful. We write


* Brownian Limit and the Gaussian Slope Distribution: In the limit as the cutoff time 
goes to zero Teutoff — 0, the Gaussian assumption for the slope distribution reproduces 
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e[4]-— exp | -(4- À, ) (2o; )| (50.2) 


0,N2z 


Here, A, is an average slope parameter and с, the slope volatility. A, will 
play a significant role in no-arbitrage considerations, as we discuss in Ch. 51. 
Note trend slopes can be either positive or negative". 


The Macro Volatilities for Times and Slopes 


The two overall macro-volatilities used in the model аге с). for macro time 


macro 
intervals, and с, for interest-rate slopes. These parameters can be derived by 


considering the shape of the envelope of macro-paths for a given rate, and 
approximately fitting the width of the envelope to that given by a standard model. 

For example, in Fig. 1 the effective Gaussian macro-volatility corresponding
to the macro-paths generated by the simulator at ten years is approximately
$\sigma_{\text{macro, Gaussian}} \approx 100\ \text{bp}/(\text{yr})^{1/2}$. This is on the order of implied-volatility values
needed to correctly price options using a simple Gaussian rate model with no
mean reversion.

More recently, Singular Spectrum Analysis is being used to obtain the 
volatility parameters for the Macro component. 


Weak Mean Reversion and the Connection to Standard Models 


The macro envelope for a given rate can also be fit with a macro weakly mean- 
reverting one-factor model. 

The reader should carefully note that the macro weak mean reversion we are 
now talking about is completely different from and independent of the strong 
mean reversion for the micro component. The effective mean reversion for the 
macro envelope will be much smaller than the very strong mean reversion needed 
to constrain the tightly fluctuating micro-paths about each macro-path. 


Brownian motion, as explained in previous chapters on path integrals. Therefore, the 
Gaussian slope distribution is a natural assumption with finite time scales. 


⁷ Avoiding Negative Rates: In practice, we first pick the macro time step and then pick
the macro slope. If the macro slope is negative enough so that the macro rate happens to
approach zero, we replace the macro time step by a smaller number to avoid negative
rates. This constraint makes the macro time-step distribution more complicated than the
formula in the text. If negative rates are allowed to some minimum cutoff value, the
procedure must be appropriately modified.

In this sense, the strong mean reversion needed to describe yield curve data,
coupled with the macro quasi-random component, can be made consistent with
the weak (or zero) mean reversion used in standard models.


The Overall Average Macro Drift and No Arbitrage 
The set of paths can include an overall average yield-curve path $R_{\text{avg}}(t,T)$
about which all macro and micro fluctuations occur. If this is done, the average
slope parameter $\Lambda_0$ is not needed and so can be set to zero.

For no arbitrage, $R_{\text{avg}}(t,T) = R_{\text{no-arb}}(t,T)$ is composed of the forward
yields $f(t,T;t_0)$ determined by today's ($t_0$) yield curve, up to convexity
corrections⁸. These convexity corrections were first determined for specific
model assumptions for multifactor models by Heath, Jarrow and Mortonⁱⁱⁱ.
Further discussion is in Appendix A and in the next chapter.


Option Pricing Using the Macro-Micro Model 

European options, for example, can be priced. The discounting from expiry can 
be done using the average model spot-rate curve or in the case of interest-rate 
options, discounting back along the short-rate paths one at a time; the latter 
method follows the conventional approach. 

We have checked that European option prices using the Macro-Micro 
simulator, with appropriate parameters, are in reasonable accord with option 
prices produced by conventional one-factor models. 

Actually, the Macro-Micro model is a risk management framework. The fact
that options can be approximately priced tells us that the model is not so far from
a no-arbitrage framework.


Developments in the Macro-Micro Model 


Various developments in the Macro-Micro Model are described in the next 
chapter. 


Wrap-Up of this Chapter 


We have presented a new multifactor yield-curve "Macro-Micro" model. This
model can be used for pricing and risk-management purposes, where the need for
consistency with the data for the statistics of the shape of the yield curve can
make a significant difference.

⁸ Coupons: Actually, the phenomenology described in this part of the book was done
directly with coupon bonds, for which we had the data. The more usual description is in
terms of zero-coupon bonds.

Two very different sorts of dynamical variables are included. Yield-curve 
Macro-paths are generated by a quasi-stochastic process that does not scale down 
to small time scales, but rather generates smooth trends in interest rates over 
months to years, the quasi-equilibrium macro-path yield curves. About these 
smooth yield-curve macro paths, the yield-curve micro paths exhibit small rapid 
fluctuations. Occasional jumps also occur. This description is consistent with 
historical yield-curve statistics, as shown in the last chapter. 

We proposed a physical interpretation of the origins of the macro and micro 
variables. Macroeconomic effects (themselves uncertain) produce the smooth 
long-term trends. Trading activities, responding to individual market events and 
following a given realized macro trend, generate the rapid fluctuations. The 
macro component contains several parameters that can be determined by 
historical data, or as implied parameters by pricing options. 


All fluctuations (slow macro and rapid micro) occur around $R_{\text{avg}}(t,T)$ that
can be specified by no-arbitrage arguments for pricing options. The trading
activities of the Micro component can be described as attempting to profit by
arbitrage while moving the market toward a local state of no-arbitrage. On the
other hand, the Macro component due to macroeconomic effects on interest rates
would seem to have nothing to do with trading arbitrage or the lack of it. Still, we
are able to incorporate no-arbitrage constraints, as discussed further in Ch. 51.

Because standard models serve well in market situations to price options, we
do not suggest that our complicated simulator be used for this purpose. On the
other hand, the Macro-Micro model can be re-expressed approximately in terms
of standard models for pricing options. For example, the Macro-Micro
multivariate yield-curve simulator, or a projected version of it using only a short
and long rate, can be used to price mortgage-backed securities products. The
micro-fluctuations are guaranteed to produce accurate short-rate/long-rate spread
statistics. Thus, the long-rate input into a prepayment model will have a
historically-consistent relation with the short discounting rate.

Still, contingent claims are just weak probes of the real underlying interest-
rate process. A model agreeing with the dynamics of yield curves provides a
better tool for interest-rate-risk analysis.


Appendix A. No Arbitrage and Yield-Curve Dynamics 


We have some general remarks. Formal properties of no-arbitrage, yield-curve 

dynamical properties, and the Macro-Micro Model are derived in Ch. 51. 

• No-arbitrage constraints are universally used in pricing contingent claims.
Still no guarantee is given that the historical statistical properties of yields
will be reproduced, even with historical volatility and correlation input.

• The statistical properties of the yield curve form physical constraints that are
just as real as the universally accepted idea that zero-coupon bond prices
should be reproduced by models.

• The statistics of the shapes of yield curves predicted by the standard-
multifactor-type models we investigated without very strong mean reversion
do not appear to reproduce the statistics of yield-curve data.

• Disagreement with yield-curve dynamics is important since that can lead to
model predictions of statistical arbitrage that are not present in the market.
This is because statistical properties of yields in the data but not in the model
can lead to portfolios that the model will statistically misprice.

• As an example, we have models that produce kinks in the yield curve that are
not observed. Such models will misprice options that are sensitive to yield-
curve shape.

• Standard models produce market prices for options by adjustment of implied
volatility. However, options are only "weak probes" of the underlying
interest-rate process; i.e. the details of the actual interest-rate process are not
tested by options.

• No-arbitrage constraints are imposed by determination of the yield-curve
path $R_{\text{avg}}(t,T)$, about which all fluctuations occur. For the short rate, this
function is chosen such that the average discount factors over all short-rate
paths agree with initial yield-curve data. In general, $R_{\text{avg}}(t,T)$ up to a
convexity correction is the forward yield $f(t,T;t_0)$ of maturity $T$ at
forward time $t$, as obtained from today's yield curve.

• No-arbitrage implies the absence of a higher total return of one portfolio of
bonds over another. Even without imposing the no-arbitrage drifts, we have
checked that portfolio returns are roughly constant. That is, no arbitrage is
approximately true even for the Monte-Carlo paths used in the analysis of the
historical data, i.e. $R_{\text{avg}}(t,T)$ as determined historically. This is consistent
with the historical yield-curve data not containing much arbitrage.


Scenario Analysis 


Scenario analysis is a different issue. Instead of following the forward rates
moving forward in time, portfolio managers often want to know what will happen
to their portfolios under scenario viewpoints on yield-curve changes. In this case
$R_{\text{avg}}(t,T) = R_{\text{scenario}}(t,T)$ can be chosen as the preferred yield-curve
scenario. The strong mean-reverting fluctuations of the Micro component would
exist around this preferred scenario. Since risk managers in general do not like to
postulate scenarios far out into the future, the scenario can be used out to a
horizon time, after which time the no-arbitrage yield curve can be used. The
maturity $T$-dependence can be chosen to produce inverted or non-inverted
average yield curves at future time $t$.

Alternatively, no-arbitrage can be used out to the horizon time, and a preferred
scenario can be used after that time.


Figures: Macro-Micro Model 


Fig. [1]. Macro quasi-equilibrium yield-curve paths in time for fixed maturity
$T$ produced by the macro simulator relative to the initial value,
$r_{\text{macro}}(t,T) - r(0,T)$. The quasi-random fluctuations in the macro paths exist
over long times. They are imagined due to uncertainties in macroeconomic
variables. These macro fluctuations can be defined around an overall average
path $R_{\text{avg}}(t,T)$ chosen to satisfy no-arbitrage term-structure constraints. This
is taken as $r(0,T)$ for simplicity in the figure. In the model, one macro-path
realization is picked out of all possibilities as the smoothly varying quasi-
equilibrium path of historical data.


Fig. [2]. Micro paths in time for a rate $r(t,T)$ at fixed $T$ forming a
"tube" of small rapid highly mean-reverting fluctuations around a Macro model
path. The dynamics for Micro-path fluctuations are described by the strong mean-
reverting Gaussian process. These restricted fluctuations agree with historical
data for yield-curve fluctuations.


Fig. [3]. Data for the Prime rate and 6-month Libor. The Prime rate is
macro-economically determined, and varies along a Macro path. The Libor rate
contains Macro effects roughly following the Prime rate up to a spread, plus
small fluctuating Micro effects. Another Macro rate is the Fed Funds target rate.


51. Macro-Micro Model: Further Developments
(Tech. Index 6/10)


Summary of This Chapter 


We consider various developments in the Macro-Micro (MM) model since the
original 1989 paperⁱ. We believe that these results are quite encouraging and
urge others to consider performing research and analysis. Here is the outline:


Using SSA to determine the Macro Component 

Intuition: Short to Long Times - Volatility, No-Arbitrage 
The Macro-Micro Model Applied to FX and Equity Markets 
Formal Developments in the Macro Micro Model 

No Arbitrage and the Macro-Micro Model: Formal Aspects 
Hedging, Forward Prices, No Arbitrage, Options (Equities) 
Satisfying the Interest-Rate Term-Structure Constraints 
Chaos and the Macro-Micro Model 

Technical Analysis and the MM Model 

The Macro-Micro Model and Data 

Finance Models Related to the Macro-Micro Model 
Economics Models Related to the Macro-Micro Model 


Using SSA to determine the Macro Component 


SSA, as described in previous chapters, is capable of determining trend behavior
(the Macro component) and eliminating noise (the Micro component). Two
interesting ideas areⁱⁱ:


1. Use SSA to provide empirical parameters as input to the original model
for the macro component with Gaussian-distributed slopes or trends.
Here is the procedure. First smooth a given time series with SSA. For a
given time interval, determine the distribution of slopes or trends for that
time series. Approximate the slope distribution with a Gaussian. This
gives an empirically determined mean and standard deviation of the
slope distribution. Then draw randomly from this slope distribution. The
overall mean can be fixed externally or approximate no-arbitrage
considerations can be used, as described later in this chapter. For
multiple variables, use the SSA-smoothed correlations, as described in
Ch. 36. The time intervals can be chosen randomly or fit to the time
intervals in a simulation.

2. For a given time interval length, with definite start and end dates, use the
SSA fit for each variable within that time interval. Randomly sample the
time interval start times, with all time intervals having given length. This
method automatically preserves the SSA-based correlations up to
statistical error because all variables are sampled within the same time
interval length¹. The time lengths can also be randomly sampled, as in
the original Macro model. The minimum time length will affect the
correlations generated in the simulator.
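The first procedure can be sketched as follows. This is a generic "basic SSA" (trajectory matrix, SVD, diagonal averaging), which may differ in detail from the SSA variant described in earlier chapters. The window length, the number of retained components, the 12-step slope interval, and the synthetic test series are all illustrative assumptions.

```python
import numpy as np

def ssa_smooth(series, window, rank):
    """Basic SSA: embed the series, keep the leading `rank` components, re-average."""
    x = np.asarray(series, dtype=float)
    k = x.size - window + 1
    # Trajectory (Hankel) matrix: entry [i, j] equals x[i + j].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging: mean over all entries with the same i + j.
    flipped = approx[::-1]
    return np.array([flipped.diagonal(d).mean() for d in range(-window + 1, k)])

def fit_slope_distribution(smooth, interval):
    """Mean and standard deviation of slopes of the smoothed series over `interval` steps."""
    slopes = (smooth[interval:] - smooth[:-interval]) / interval
    return slopes.mean(), slopes.std(ddof=1)

rng = np.random.default_rng(0)
t = np.arange(240)
raw = 0.5 * t + rng.normal(0.0, 1.0, t.size)      # synthetic trend plus noise
trend = ssa_smooth(raw, window=60, rank=2)        # Macro (trend) estimate
mean_slope, slope_vol = fit_slope_distribution(trend, interval=12)
draws = rng.normal(mean_slope, slope_vol, size=5) # Gaussian slope draws for a simulator
print(mean_slope, slope_vol, draws)
```

For the second procedure one would instead sample start dates and reuse the smoothed segments of all variables over the same window, which is what preserves the correlations.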


Intuition: Short to Long Times - Volatility, No-Arbitrage 


The Macro-Micro model acts differently for short, intermediate, and long times. 


Volatility in the Macro-Micro Model
First consider volatility, and focus on one macro path $P_{\text{macro}}$ starting from time
zero, with micro paths moving around $P_{\text{macro}}$. At short times, the drift of $P_{\text{macro}}$
from the macro simulator is one number, constant. In that case, the model
volatility is just the volatility of the mean-reverting Micro model. At intermediate
times moving along $P_{\text{macro}}$, the drift changes a few times. The macro volatility
starts to mix with the micro volatility. At long times moving along $P_{\text{macro}}$, the
drift changes many times. Assume that the probability distribution for the time
intervals $\wp[\Delta t_{\text{macro}}]$ is small unless $\Delta t_{\text{macro}} < \Delta t^{(\max)}_{\text{macro}}$ (cf. Ch. 50) for some
$\Delta t^{(\max)}_{\text{macro}}$. Then, for long times relative to $\Delta t^{(\max)}_{\text{macro}}$, the volatility is mostly the Macro
volatility with some Micro component.

In this way the Macro-Micro model volatility derives time dependence from
the dynamics, even if both micro and macro volatilities are assumed constant.


¹ Correlation simulator: In fact, the simulator will vary the correlations around the
empirical values, which automatically creates a sort of correlation simulator.

No-Arbitrage and Arbitrage in the Macro-Micro Model


Consider no-arbitrage and arbitrage as a function of time. At short times, the
Micro model is relevant and the no-arbitrage conditions described above are
relatively simple, although more complicated than a single drift model. At
intermediate times, the Macro and Micro components mix, and some more
arbitrage exists. At long times relative to $\Delta t^{(\max)}_{\text{macro}}$, the model no-arbitrage
conditions are again relatively simple.
Below we examine no-arbitrage and arbitrage in more detail.


The Macro-Micro Model applied to FX and Equity Markets 


The Macro-Micro model was originally formulated for interest rates, as described 
in preceding chapters. The origin of the model was the desire to describe the 
statistics of yield-curve data including the absence of kinks, and to be able to 
price contingent claims. 


Review: The CDA Test and the Yield Curve 


As described in Ch. 50, a sharp statistical tool enabling us to understand yield-
curve dynamics is the 3rd-order generalized skewness Green function in the
Cluster Decomposition Analysis (CDA). This quantity vanishes for a Gaussian
measure, i.e. $C_3^{(\text{Gaussian})} = 0$. The application of the CDA proceeds in two steps.
First, using algebra we isolate the assumed Gaussian measure $dz(t)$ occurring in
the stochastic equation of a putative model. Then we use the data to test whether
or not $C_3/M_3 \approx 0$ for this supposed Gaussian $dz(t)$, and thereby test the
model's validity directly using the data. Here $M_3$ is used for normalization
purposes, and is nonzero for nonzero drift (which always occurs in data). If the
requirement $C_3/M_3 \approx 0$ is satisfied, the model is consistent with the data. If this
requirement is not satisfied, e.g. if $C_3/M_3 \gg 0$ for $dz(t)$, then the model is
inconsistent with the data.

For the yield curve, we found that if strong mean reversion about a quasi-
equilibrium moving average is used, then $C_3/M_3 \approx 0$ is satisfied. This enabled
us to develop a strong mean-reverting MC simulator that produced yield curves
that look like real data, in addition to passing a battery of statistical tests.

Equities, FX, Strong Mean Reversion, and the CDA Test
We have carried out similar preliminary analyses for both the FX and equity
markets, and find results similar to the yield-curve case. The $C_3/M_3 \approx 0$
requirement is satisfied only if strong mean reversion is assumed for FX and
equities, similar to the interest rate case. This provides some evidence that strong
mean-reverting fluctuations about a quasi-equilibrium mean also operate in the
FX and equity markets².


Probability Analyses for FX and the Macro-Micro Model 


Even a casual look at FX data over many years presents striking behavior of long 
successive periods of time during which distinct long-term trends were present, 
and about which small oscillations occur. Such behavior is natural and expected 
with the Macro-Micro model with quasi-random slopes and strong mean- 
reverting oscillations. Conversely, this observed behavior is unnatural and highly 
unlikely in standard models. We now consider some details explicitly. 

In my CIFEr tutorialⁱⁱⁱ, some probability calculations were presented that
indirectly imply the existence of the Macro-Micro model for FX. These 
calculations were based on a simple analysis of the standard no-arbitrage FX 
model. 

The standard no-arbitrage FX model was described in Ch. 5. It says that 
relative changes of an exchange rate fluctuate about a drift given by interest-rate 
parity. Contrary to the equity market where the no-arbitrage drift is determined 
by portfolio arguments and not supposed to have much to do with the actual 
behavior of equities, the FX drift is a physically motivated quantity directly tied 
to FX forwards, and these are transacted in the market. To the extent that the 
actual spot FX time dependence behavior is correlated with that predicted by FX 
forwards, the market spot FX time behavior should at least approximately exhibit 
the behavior of the no-arbitrage FX model. Therefore, we can use the no- 
arbitrage drift of FX to gain some physical insight. 

The calculation performed was a straightforward check of how probable the 
observed market behavior on the average over the long term (1972-1999) could 
be described by the no-arbitrage FX model. A variety of time series was 
considered, all with similar results. These are described next. 


Long-Term FX Data: Probability Standard Model Holds is 10^(-8) 
For the case of $\ln[\text{DEM}/\text{USD}]$, four distinct regions occurred for the average
behavior from 1972-1999. Qualitatively, these regions exhibited: (I) Decrease
1972-1980, (II) Increase 1980-1985, (III) Decrease 1985-1988, and (IV) Flat
1988-1999. The total amounts of movement in the first three regions were very
large, making it quite improbable that the observed behavior could be explained
by the standard model.

² Trading with the MM Model: I know one equity trading shop that uses a strategy
essentially equivalent to the MM Model.

The probability that the behavior observed in the first three regions can be
explained by the standard no-arbitrage FX model can be evaluated using barrier
probabilities³. The probability that the observed decrease occurred in region I is a
Down-In probability. Numerically, this probability is 5%, $P_{\text{Region I}} = 0.050$.
Region II involves an Up-In probability of 0.01%, $P_{\text{Region II}} = 0.0001$. Region III
involves another Down-In probability of 0.1%, $P_{\text{Region III}} = 0.001$. All these
probabilities are small.

The composite probability is the product of these three probabilities,
$P_{\text{All Regions}} = O(10^{-8})$. This small value is the probability that long-term FX rate
behavior can be explained by the standard FX no-arbitrage drift plus Brownian
noise model. To put this in human terms, $10^{-8}$ is the probability of correctly
picking out one pre-specified second of time in three years.
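The arithmetic behind the composite probability and the "one second in three years" comparison can be checked directly. The helper `prob_hit_lower_barrier` is an added illustration: it is the standard reflection-principle result for an arithmetic Brownian motion with constant drift, which we assume here as a stand-in for the Down-In probability (the Up-In case follows by reversing the signs of the drift and the move). It is not necessarily the exact formula of Ch. 17, and the inputs behind the quoted regional probabilities are not reproduced.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_hit_lower_barrier(drop, mu, sigma, horizon):
    """P[min over s <= horizon of (mu*s + sigma*W_s) <= -drop], for drop > 0."""
    root = sigma * math.sqrt(horizon)
    return (norm_cdf((-drop - mu * horizon) / root)
            + math.exp(-2.0 * mu * drop / sigma**2)
            * norm_cdf((-drop + mu * horizon) / root))

# Regional probabilities quoted in the text.
region_probs = {"I (Down-In)": 0.050, "II (Up-In)": 0.0001, "III (Down-In)": 0.001}
composite = math.prod(region_probs.values())

# Chance of picking one pre-specified second out of three years.
one_second_in_three_years = 1.0 / (3 * 365.25 * 24 * 3600)

print("composite probability:", composite)
print("one second in three years:", one_second_in_three_years)
```

Both numbers are of order $10^{-8}$, which is the comparison made in the text.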


Formal Developments in the Macro-Micro Model 


In this section, we deal with some formal developments in the Macro-Micro 
model. This includes a discussion of hedging, consistency with forward 
quantities, term-structure constraints, and no-arbitrage. 

In general, we will have a collection of macro parameters that we generically
call $\{\Lambda_\beta\}$. In Ch. 50, we considered quasi-random slopes or trends over random
times above a cutoff time. In the next chapter, we will introduce a set of
"Toolkit" functions with several parameters that might serve to describe quasi-
random cycles. For purposes of this chapter, we keep the idea general.

It is important first to consider fixed specific values of the parameters $\{\Lambda_\beta\}$
and then to perform the averaging over their various possible values. A specific
deterministic function (i.e. not random) will be used to enforce constraints.


The Green Function with a Specific Quasi-Random Drift
For fixed specific values of the macro parameters $\{\Lambda_\beta\}$, the quasi-equilibrium
quasi-random drift $\mu(t,\{\Lambda_\beta\})$ is deterministic. Therefore, for fixed $\{\Lambda_\beta\}$, the
usual convolution theorems hold for Gaussian propagators in $x(t)$. We therefore
get the Green function $G_{0*}(\{\Lambda_\beta\})$ over time $T_{0*} = t^* - t_0$ from $x_0$ to $x^*$ with
the fixed parameters $\{\Lambda_\beta\}$ directly in the usual form. Here the subscripts on
$G_{0*}(\{\Lambda_\beta\})$ indicate the spatial and time dependences of the fully notated Green
function $G(x_0, x^*; t_0, t^*; \{\Lambda_\beta\})$. We obtain (including discounting)

$$G_{0*}(\{\Lambda_\beta\}) = \frac{e^{-r_0 T_{0*}}}{\sqrt{2\pi\sigma_0^2 T_{0*}}} \exp\left[ -\Phi_{0*}(\{\Lambda_\beta\}) \right] \tag{51.1}$$

Here $\Phi_{0*}(\{\Lambda_\beta\}) = \left[ x^* - x_0 - \mu_{0*}(\{\Lambda_\beta\})\, T_{0*} \right]^2 / \left( 2\sigma_0^2 T_{0*} \right)$. The total macro
drift $\mu_{0*}(\{\Lambda_\beta\})$ for fixed $\{\Lambda_\beta\}$ is the usual time-averaged expression,

$$\mu_{0*}(\{\Lambda_\beta\}) = \frac{1}{T_{0*}} \int_{t_0}^{t^*} \mu(t,\{\Lambda_\beta\})\, dt \tag{51.2}$$


³ Parameters: We used volatilities and rates current as of the time of the data to evaluate
the probabilities. For a discussion of barrier probabilities, see Ch. 17.

Averaging the Green Function over Macro Parameters $\{\Lambda_\beta\}$

To describe future paths, we naturally do not know which macro parameters
$\{\Lambda_\beta\}$ to use. Call $\wp[\{\Lambda_\beta\}]$ the probability density function of the $\{\Lambda_\beta\}$. This
generalizes the $\wp[\Lambda]$ probability function of the slope in Ch. 50. The
associated measure of $\wp[\{\Lambda_\beta\}]$ is $\mathcal{D}\Lambda = \wp[\{\Lambda_\beta\}] \prod_\beta d\Lambda_\beta$, where the
normalization is $\int_{\{\Lambda_\beta\} \subset \text{Set}_\Lambda} \wp[\{\Lambda_\beta\}] \prod_\beta d\Lambda_\beta = 1$. Here $\text{Set}_\Lambda$ is the set of possible
values of the various parameters $\{\Lambda_\beta\}$. We keep $\text{Set}_\Lambda$ independent of time for
simplicity, although the theory can be generalized to make $\text{Set}_\Lambda$ depend on time.

For option pricing, we need the expectation over all variables; hence, we
want to average over the macro parameters $\{\Lambda_\beta\}$. Therefore we multiply
$G_{0*}(\{\Lambda_\beta\})$ by $\wp[\{\Lambda_\beta\}]$ and integrate over the possible values of the $\{\Lambda_\beta\}$.
We call the resulting expression $G_{0*}^{\text{Avg}\{\Lambda\}}$, where

$$G_{0*}^{\text{Avg}\{\Lambda\}} = \int_{\{\Lambda_\beta\} \subset \text{Set}_\Lambda} G_{0*}(\{\Lambda_\beta\})\, \wp[\{\Lambda_\beta\}] \prod_\beta d\Lambda_\beta \tag{51.3}$$


Options and the Macro Parameter - Averaged Green Function

The quantity $G_{0*}^{\text{Avg}\{\Lambda\}}$ averaged over the $\{\Lambda_\beta\}$ parameters as just defined is the
Green function needed for performing discounted expected values of cash flows
in pricing for equities, FX, and commodities options. For a European option
valued today $t_0$ with payout $C(S^*, t^*)$ at expiration $t^*$, we merely commute the
integrals⁴ over $\{\Lambda_\beta\}$ and $x^*$. We obtain

$$C_0^{\text{Avg}\{\Lambda\}} = C^{\text{Avg}\{\Lambda\}}(S_0, t_0) = \int_{-\infty}^{\infty} G_{0*}^{\text{Avg}\{\Lambda\}}\, C(S^*, t^*)\, dx^* \tag{51.4}$$

To repeat, this is just the usual option expression averaged over the $\{\Lambda_\beta\}$ macro
parameters.
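A numerical sketch of the averaging over macro parameters is given below for the simplest case we can think of: a single macro parameter, a constant slope that is Gaussian with mean `lam0` and volatility `sigma_lam` over the whole interval. This special case, the function names, and the numbers are assumptions made for illustration. In this case the average can be done in closed form (a Gaussian with variance $\sigma_0^2 T_{0*} + \sigma_\Lambda^2 T_{0*}^2$), which gives a check on the quadrature.

```python
import math
import numpy as np

def green_fixed(x_end, x0, mu, sigma, r, T):
    """Discounted Gaussian Green function at a fixed constant drift mu."""
    var = sigma**2 * T
    phi = (x_end - x0 - mu * T) ** 2 / (2.0 * var)
    return math.exp(-r * T) * np.exp(-phi) / math.sqrt(2.0 * math.pi * var)

def green_averaged(x_end, x0, lam0, sigma_lam, sigma, r, T, n_nodes=64):
    """Average green_fixed over mu ~ N(lam0, sigma_lam^2) by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    total = np.zeros_like(np.asarray(x_end, dtype=float))
    for node, weight in zip(nodes, weights):
        mu = lam0 + math.sqrt(2.0) * sigma_lam * node
        total = total + weight * green_fixed(x_end, x0, mu, sigma, r, T)
    return total / math.sqrt(math.pi)

def green_closed_form(x_end, x0, lam0, sigma_lam, sigma, r, T):
    """Closed form of the average for a single constant Gaussian slope."""
    var = sigma**2 * T + sigma_lam**2 * T**2
    phi = (x_end - x0 - lam0 * T) ** 2 / (2.0 * var)
    return math.exp(-r * T) * np.exp(-phi) / math.sqrt(2.0 * math.pi * var)

x_grid = np.linspace(-1.0, 1.0, 9)
numeric = green_averaged(x_grid, 0.0, 0.03, 0.10, 0.20, 0.05, 1.0)
exact = green_closed_form(x_grid, 0.0, 0.03, 0.10, 0.20, 0.05, 1.0)
print("max abs difference:", np.max(np.abs(numeric - exact)))
```

The extra $T^2$ term in the variance is one concrete way to see the macro volatility adding to the short-time volatility as the horizon grows.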


No Arbitrage and the Macro-Micro Model: Formal Aspects 


This section discusses the formal properties of no-arbitrage conditions in the 
presence of quasi-equilibrium drifts with quasi-random movements including 
minimum time scales. We discuss both the equity market and the interest-rate 
market. 

We will see that forward quantities can be matched. An important result is the 
proof that short-term hedging arguments remain intact. 

Naturally, some changes occur in some quantities, as we can expect, since we 
have extra dynamical variables relative to standard models. These include an 
extra term in an effective diffusion equation and an extra term in the no-arbitrage 
formalism. 


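First, we will define the general expectation $\langle X \rangle_\Lambda$ of any quantity $X$ with
respect to $\{\Lambda_\beta\}$:

$$\langle X \rangle_\Lambda = \int_{\{\Lambda_\beta\} \subset \text{Set}_\Lambda} X(\{\Lambda_\beta\})\, \wp[\{\Lambda_\beta\}] \prod_\beta d\Lambda_\beta \tag{51.5}$$

For example, writing $\mu_a(\{\Lambda_\beta\}) = \mu(t_a,\{\Lambda_\beta\})$ at time $t_a$, we have

$$\mu_a^{\text{Avg}\{\Lambda\}} = \langle \mu_a \rangle_\Lambda = \int_{\{\Lambda_\beta\} \subset \text{Set}_\Lambda} \mu_a(\{\Lambda_\beta\})\, \wp[\{\Lambda_\beta\}] \prod_\beta d\Lambda_\beta \tag{51.6}$$

Also, note that $G_{0*}^{\text{Avg}\{\Lambda\}} = \langle G_{0*} \rangle_\Lambda$ as above.


⁴ Commutative Restrictions: As mentioned in the section on stochastic volatility in Ch.
6, App. B, the integrals may not commute unless appropriate restrictions are placed on
the parameters.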


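The (Drift, Green Function) Correlation $\langle \mu_a G_{ab} \rangle_{\Lambda;c}$
The correlation function $\langle \mu_a G_{ab} \rangle_{\Lambda;c}$ will turn out to be important. This quantity
is defined as

$$\langle \mu_a G_{ab} \rangle_{\Lambda;c} = \langle \mu_a G_{ab} \rangle_\Lambda - \langle \mu_a \rangle_\Lambda \langle G_{ab} \rangle_\Lambda \tag{51.7}$$

The subscripts $a, b$ are shorthand for $(x_a, t_a)$ and $(x_b, t_b)$. We have

$$\langle \mu_a G_{ab} \rangle_{\Lambda;c} = \int_{\{\Lambda_\beta\} \subset \text{Set}_\Lambda} \left[ \mu_a(\{\Lambda_\beta\}) - \mu_a^{\text{Avg}\{\Lambda\}} \right] G_{ab}(\{\Lambda_\beta\})\, \wp[\{\Lambda_\beta\}] \prod_\beta d\Lambda_\beta \tag{51.8}$$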


The Macro Parameter - Averaged Diffusion Equation
Again, assume that the variable $x$ is Gaussian. With a fixed set of parameters
$\{\Lambda_\beta\}$, the Green function $G_{ab}(\{\Lambda_\beta\}) = G(x_a, x_b; t_a, t_b; \{\Lambda_\beta\})$ satisfies the
standard diffusion equation with drift $\mu(t_a,\{\Lambda_\beta\})$. Here the (backward)
diffusion operator is with respect to $(x_a, t_a)$. This is because $G_{ab}(\{\Lambda_\beta\})$ has the
canonical Gaussian form for fixed $\{\Lambda_\beta\}$.

We average this diffusion equation with respect to the $\{\Lambda_\beta\}$. We obtain

$$\left[ \frac{\partial}{\partial t_a} + \frac{1}{2}\sigma^2(t_a) \frac{\partial^2}{\partial x_a^2} + \mu_a^{\text{Avg}\{\Lambda\}} \frac{\partial}{\partial x_a} - r(t_a) \right] G_{ab}^{\text{Avg}\{\Lambda\}} = -\frac{\partial}{\partial x_a} \langle \mu_a G_{ab} \rangle_{\Lambda;c} \tag{51.9}$$

The term on the right-hand side is the $x_a$-derivative of the correlation function
$\langle \mu_a G_{ab} \rangle_{\Lambda;c}$. Again, the average is over the macro parameters $\{\Lambda_\beta\}$. This is an
extra term not present in the standard formulation.
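The origin of the extra term can be traced in one step. The sketch below assumes that the volatility and the discount rate do not depend on the macro parameters, and that the backward operator includes the discounting term $-r(t_a)$ because the Green function includes discounting. The subscript $c$ on the bracket is our shorthand for the correlation (the average of the product minus the product of the averages).

```latex
% Average the fixed-parameter backward equation over the macro parameters:
\left\langle \left[ \frac{\partial}{\partial t_a}
   + \frac{1}{2}\sigma^2(t_a)\frac{\partial^2}{\partial x_a^2}
   + \mu_a(\{\Lambda_\beta\})\frac{\partial}{\partial x_a}
   - r(t_a) \right] G_{ab}(\{\Lambda_\beta\}) \right\rangle_\Lambda = 0 .

% Only the drift term mixes the drift and the Green function. Since \mu_a does
% not depend on x_a, the derivative can be pulled outside the average:
\left\langle \mu_a \frac{\partial G_{ab}}{\partial x_a} \right\rangle_\Lambda
  = \frac{\partial}{\partial x_a} \langle \mu_a G_{ab} \rangle_\Lambda
  = \mu_a^{\mathrm{Avg}\{\Lambda\}} \frac{\partial G_{ab}^{\mathrm{Avg}\{\Lambda\}}}{\partial x_a}
  + \frac{\partial}{\partial x_a} \langle \mu_a G_{ab} \rangle_{\Lambda;c} .

% Moving the last term to the right-hand side gives the averaged equation, with
% the x_a-derivative of the correlation appearing with a minus sign.
```

If the drift did not fluctuate with the macro parameters, the correlation would vanish and the standard backward equation would be recovered.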


Hedging, Forward Prices, No Arbitrage, Options (Equities) 


In this section, we deal with hedging, forward prices, and no arbitrage for equity
(and FX, commodity etc.) options in the presence of the quasi-random drifts
specified by variation of the $\{\Lambda_\beta\}$ parameters. The standard delta hedging
procedure goes through. We introduce a deterministic average slope function
$\Lambda_0(t)$ that can be used to enforce the forward stock price. The no-arbitrage
equation picks up an extra term containing the correlation $\langle \mu_0 G_{0*} \rangle_{\Lambda;c}$.


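Hedging at Fixed $\{\Lambda_\beta\}$ Parameters


The usual hedging arguments go straight through for fixed parameters $\{\Lambda_\beta\}$.
This is because the drift for fixed $\{\Lambda_\beta\}$ is deterministic. We consider the usual
portfolio $V(S,t) = N S + C(S,t)$ of one option plus stock. The portfolio value
$V_0$ at time $t_0$ changes to $V_1$ at time $t_1 = t_0 + dt_0$. Calling the drift
$\mu_0(\{\Lambda_\beta\}) = \mu(t_0,\{\Lambda_\beta\})$, we get the standard result at fixed $\{\Lambda_\beta\}$:

$$V_1 - V_0 = \left[ N_0 + \frac{\partial C_0}{\partial S_0} \right] (S_1 - S_0) - \left[ \mu_0(\{\Lambda_\beta\}) + \frac{1}{2}\sigma_0^2 \right] S_0 \frac{\partial C_0}{\partial S_0}\, dt_0 + r_0 C_0\, dt_0 \tag{51.10}$$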


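The standard hedging prescription $-N_0 = \Delta_0 = \partial C_0 / \partial S_0$ eliminates the
stochastic term proportional to $(S_1 - S_0)$ at fixed $\{\Lambda_\beta\}$.

Hedging Including Averaging over the $\{\Lambda_\beta\}$ Parameters
Call $V^{\text{Avg}\{\Lambda\}}(S,t) = N S + C^{\text{Avg}\{\Lambda\}}(S,t)$ the $\{\Lambda_\beta\}$-averaged portfolio at time
$t$. Call $\mu_0^{\text{Avg}\{\Lambda\}}$ the averaged drift at time $t_0$. We get directly

$$V_1^{\text{Avg}\{\Lambda\}} - V_0^{\text{Avg}\{\Lambda\}} = \left[ N_0 + \frac{\partial C_0^{\text{Avg}\{\Lambda\}}}{\partial S_0} \right] (S_1 - S_0) - \left[ \mu_0^{\text{Avg}\{\Lambda\}} + \frac{1}{2}\sigma_0^2 \right] S_0 \frac{\partial C_0^{\text{Avg}\{\Lambda\}}}{\partial S_0}\, dt_0 + r_0 C_0^{\text{Avg}\{\Lambda\}}\, dt_0 \tag{51.11}$$

The hedging prescription that eliminates the stochastic term proportional to
$(S_1 - S_0)$ is just the $\{\Lambda_\beta\}$-averaged expression for $\Delta_0$, namely

$$-N_0 = \Delta_0^{\text{Avg}\{\Lambda\}} = \frac{\partial C_0^{\text{Avg}\{\Lambda\}}}{\partial S_0} \tag{51.12}$$

This produces the usual hedging prescription exactly, since $C_0^{\text{Avg}\{\Lambda\}}$ is the final
model option value at $t_0$ after averaging.
The bottom line is that the standard hedging prescription goes through in the
presence of $\{\Lambda_\beta\}$ averaging.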


The Forward Stock Price is the Standard Expression
The forward stock price at $t^*$ is obtained by writing down the standard
expression at fixed $\{\Lambda_\beta\}$ and then averaging over $\{\Lambda_\beta\}$. We take the
prescription fixing the averaged-$\{\Lambda_\beta\}$ drift for $T_{0*} = t^* - t_0$ as

$$\left\langle \exp\left[ \mu_{0*}(\{\Lambda_\beta\})\, T_{0*} \right] \right\rangle_\Lambda = \exp\left[ \left( r_0 - \tfrac{1}{2}\sigma_0^2 \right) T_{0*} \right] \tag{51.13}$$

This can be done using a point-by-point specification of the deterministic
function $\Lambda_0(t)$ discussed before. The above condition reduces to
$\mu_0^{\text{Avg}\{\Lambda\}} = r_0 - \sigma_0^2/2$ when $T_{0*} \to dt_0$ becomes infinitesimal.

We then obtain the usual expression for the $\{\Lambda_\beta\}$-averaged forward stock
price $S_{\text{fwd}}^{\text{Avg}\{\Lambda\}}(t^*)$ at time $t^*$, viz (ignoring dividends),

$$S_{\text{fwd}}^{\text{Avg}\{\Lambda\}}(t^*) = S_0 \exp\left[ \int_{t_0}^{t^*} r(t)\, dt \right] \tag{51.14}$$
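As a worked illustration of the forward-price prescription, take the simplest special case (our assumption, not the general model): a single slope that is constant over the interval and Gaussian with mean $\Lambda_0$ and volatility $\sigma_\Lambda$, so that the total drift is $\mu_{0*} = \Lambda$.

```latex
% Gaussian average of the exponential (moment generating function):
\left\langle e^{\Lambda T_{0*}} \right\rangle_\Lambda
  = \exp\left[ \Lambda_0 T_{0*} + \tfrac{1}{2}\sigma_\Lambda^2 T_{0*}^2 \right].

% Imposing the forward-price prescription
%   < exp(mu_{0*} T_{0*}) > = exp[(r_0 - sigma_0^2/2) T_{0*}]
% fixes the average slope:
\Lambda_0 = r_0 - \tfrac{1}{2}\sigma_0^2 - \tfrac{1}{2}\sigma_\Lambda^2 T_{0*} .

% As T_{0*} -> dt_0 the last term vanishes and Lambda_0 -> r_0 - sigma_0^2/2,
% in agreement with the infinitesimal limit quoted in the text.
```

In this special case the required average slope depends on the horizon, which is one way to see why a point-by-point specification of the deterministic function is needed in general.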


Modified No Arbitrage Condition for Equity Options
The no-arbitrage condition $dV_0/dt_0 = r_0 V_0$ could have been produced at fixed $\{\Lambda_\beta\}$ by
setting $\mu_0(\{\Lambda_\beta\})$ equal to the usual expression $r_0 - \sigma_0^2/2$. However, we do not
want to do this because we would have to impose this constraint for every set
$\{\Lambda_\beta\}$. Thus, we would lose the $\{\Lambda_\beta\}$ variability of the drift, which is the whole
point of the Macro component. For this reason, we consider no-arbitrage only
after averaging over $\{\Lambda_\beta\}$ by writing $\mu_0^{\text{Avg}\{\Lambda\}} = r_0 - \sigma_0^2/2$. Because options in
this framework are averaged over $\{\Lambda_\beta\}$, this approach is consistent.


We need to specify the drift a bit further. We refer the reader to Ch. 50 on the
Macro-Micro model, where a distribution of slopes was used to describe the
drifts. The distribution of slopes has an average slope. We take this average slope
$\Lambda_0(t)$ as a deterministic function of time. The averaging over quasi-random
variations of the $\{\Lambda_\beta\}$ parameters will not include variation of this deterministic
function $\Lambda_0(t)$. We then write the drift $\mu(t,\{\Lambda_\beta\})$ in the form

$$\mu(t,\{\Lambda_\beta\}) = \Lambda_0(t) + \delta\mu(t,\{\Lambda_\beta\}) \tag{51.15}$$


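We define $\delta\mu^{\text{Avg}\{\Lambda\}}(t_0)$ via $\{\Lambda_\beta\}$-averaging of $\delta\mu(t,\{\Lambda_\beta\})$ at time $t_0$. We
choose $\Lambda_0(t_0) = -\delta\mu^{\text{Avg}\{\Lambda\}}(t_0) + \left( r_0 - \sigma_0^2/2 \right)$ so $\mu_0^{\text{Avg}\{\Lambda\}} = r_0 - \sigma_0^2/2$. We
can choose $r_0$ as the standard risk free rate. With this done, we get

$$\frac{dV_0^{\text{Avg}\{\Lambda\}}}{dt_0} = r_0 V_0^{\text{Avg}\{\Lambda\}} - \frac{\partial}{\partial x_0} \int_{-\infty}^{\infty} \langle \mu_0 G_{0*} \rangle_{\Lambda;c}\, C(S^*, t^*)\, dx^* \tag{51.16}$$

This is as close as we can get to the standard no-arbitrage condition. The second
term with the integral of the correlation function $\langle \mu_0 G_{0*} \rangle_{\Lambda;c}$ is an extra piece.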


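The Modified Black-Scholes Equity Option Formula
Consider a European call option. We first carry out the usual integral of the
payoff over the terminal co-ordinate $x^*$ and afterward perform the $\{\Lambda_\beta\}$
averaging. Define

$$\zeta(\{\Lambda_\beta\}) = \frac{\exp\left[ \mu_{0*}(\{\Lambda_\beta\})\, T_{0*} \right]}{\left\langle \exp\left[ \mu_{0*}(\{\Lambda_\beta\})\, T_{0*} \right] \right\rangle_\Lambda}$$

Standard manipulations lead to a modified Black-Scholes formula with $\{\Lambda_\beta\}$ averaging,

$$C_0^{\text{Avg}\{\Lambda\}} = \left\langle S_0\, \zeta(\{\Lambda_\beta\})\, N\left[ d_+(\{\Lambda_\beta\}) \right] - E\, e^{-r_0 T_{0*}}\, N\left[ d_-(\{\Lambda_\beta\}) \right] \right\rangle_\Lambda \tag{51.17}$$

Here $d_\pm(\{\Lambda_\beta\})$ are the usual expressions but containing the drift $\mu_{0*}(\{\Lambda_\beta\})$.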
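The averaged formula can be checked numerically in a special case where the answer is known. The sketch below assumes a single constant Gaussian slope over the option's life, with the average slope fixed by the forward-price prescription (51.13). In that case the averaged price should equal an ordinary Black-Scholes price with total variance $\sigma_0^2 T_{0*} + \sigma_\Lambda^2 T_{0*}^2$. The function names, the strike symbol, and all numerical inputs are illustrative assumptions.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(s0, strike, r, sigma, T):
    """Ordinary Black-Scholes call, no dividends."""
    root = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * T) / root
    return s0 * norm_cdf(d_plus) - strike * math.exp(-r * T) * norm_cdf(d_plus - root)

def mm_averaged_call(s0, strike, r, sigma, T, sigma_lam, n_nodes=64):
    """Call price averaged over a constant Gaussian slope, Gauss-Hermite quadrature."""
    # Average slope chosen so that <exp(mu T)> = exp((r - sigma^2/2) T).
    lam0 = r - 0.5 * sigma**2 - 0.5 * sigma_lam**2 * T
    norm = math.exp((r - 0.5 * sigma**2) * T)
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    root = sigma * math.sqrt(T)
    price = 0.0
    for node, weight in zip(nodes, weights):
        mu = lam0 + math.sqrt(2.0) * sigma_lam * node
        zeta = math.exp(mu * T) / norm
        d_minus = (math.log(s0 / strike) + mu * T) / root  # usual form with drift mu
        d_plus = d_minus + root
        term = s0 * zeta * norm_cdf(d_plus) - strike * math.exp(-r * T) * norm_cdf(d_minus)
        price += weight * term
    return price / math.sqrt(math.pi)

s0, strike, r, sigma, T, sigma_lam = 100.0, 100.0, 0.05, 0.20, 1.0, 0.10
averaged = mm_averaged_call(s0, strike, r, sigma, T, sigma_lam)
sigma_eff = math.sqrt(sigma**2 + sigma_lam**2 * T)
benchmark = black_scholes_call(s0, strike, r, sigma_eff, T)
print(averaged, benchmark)
```

The agreement in this special case reflects the earlier remark that the Macro-Micro model can be re-expressed approximately in terms of standard models for pricing options, here through an effective volatility that depends on the horizon.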


Satisfying the Interest-Rate Term-Structure Constraints 


For term-structure constraints for interest rate dynamics, we need consistency
with the zero-coupon bond prices given by the data at time $t_0$. The zero-coupon
bond at $t_0$ paying one at maturity date $T$ (time to maturity $T - t_0$) must equal the
$x$-spatial integral at the maturity date of the Green function $G^{\mathrm{Avg}\{A\}}$ including discounting, viz
D e f Gas gy (51.18) 


As in the above sections on equity options, we utilize the deterministic slope
function $A_{\mathrm{Avg}}(t)$, writing the drift as $\mu(t,\{A_\beta\}) = A_{\mathrm{Avg}}(t) + \delta\mu(t,\{A_\beta\})$. We
specify $A_{\mathrm{Avg}}(t)$ to satisfy the term-structure constraints in the presence of $\{A_\beta\}$
averaging over the rest of the quasi-random drift function. This idea was already
present in the original Macro-Micro paper in 1989.


The idea is to define $R_{\mathrm{Avg}}(t) = \int_{t_0}^{t} A_{\mathrm{Avg}}(t')\,dt'$, which acts like an interest rate.
We then determine $\exp\left[-\int_{t_0}^{t} R_{\mathrm{Avg}}(t')\,dt'\right]$. This quantity is proportional to the $t$-
maturity zero-coupon bond. The proportionality constant depends on the model.
We then get $A_{\mathrm{Avg}}(t) = dR_{\mathrm{Avg}}(t)/dt$. There are enough degrees of freedom in this
arbitrary function (actually an infinite number) so that arbitrary term structure
constraints can be incorporated.
For the case of the mean-reverting Gaussian model⁵, for example, the


Tos) Avg {2} (r,t) 


proportionality constant is у Ж where 


ж 


Pav (nt) = Kye exp J di [ours (2, ))ar (51.19) 


fo o À 


2 
Calling œ the mean reversion, К = exp Е —g "he ) + a J (aT). ] : 
@ 


Here, z (at) ир!” 1) (n )| The starting value 
OT 


is xy = 0. The discount factor е lv is equivalent to integrating the short-rate 


discount factors over time Ta 


Chaos and the Macro-Micro Model 


In Ch. 46, we consider chaos-like models, including the Reggeon Field Theory. 
These are scaling models without any explicit time scales, and so they are very 
different from the Macro-Micro model that has explicit time scales. While we are 
convinced that chaos-like models have a place in describing rare fat-tail jumps, 
we are equally convinced that the bulk of the data requires the presence of time 
scales not present in chaos models. 


5 Mean-Reverting Gaussian Model: See Ch. 43 on path integrals and options.


Data, the Macro-Micro Model, and Chaos


As we saw in detail in Ch. 49, the description of yield-curve statistical data was 
possible with a strong mean-reverting model. Conversely, the data were 
problematic for other models to describe. The presence of distinct time scales 
seemed unavoidable. It is a challenge to chaos models without distinct time 
scales to fit the details of the yield-curve data. 


The Econometric Model for FX Crises and Chaos 


The econometric Kaminsky-Reinhart/Omarova FX Crisis model, described 
below, has important significance for chaos models. Explicit long time scales of 
one to two years are used in the analysis for the crisis windows. Such 
distinguished time scales are not present in chaos scaling theories that by 
definition are supposed to work at all time scales. 


Anomalous Dimensions, Chaos, and the Macro-Micro Model 
The Macro-Micro model can numerically mimic anomalous dimensions.
Anomalous dimensions represent deviations from square-root time scaling $\sqrt{t}$;
i.e. $t^{\zeta}$ with $\zeta \neq 0.5$.

Deviation from $\sqrt{t}$ scaling is often associated with chaos-like behavior. The
point here is that the very different underlying dynamics of the Macro-Micro
model can mimic this chaos-like behavior without any such scaling properties in
the model. Contrariwise, the Macro-Micro model contains explicit time scales.

We can explicitly construct an “effective” time-dependent volatility for the
Macro-Micro model with a mean-reverting Micro component and the quasi-
random slope Macro component. For simplicity, we model one variable.
Specifically, we can write $\sigma_K^{\mathrm{MicroMacro}}$ as the effective total volatility in the
$K^{\mathrm{th}}$ macro time interval of length $\tau_{K,K+1}$ due to both macro and micro effects.
Denote the micro volatility by $\sigma_K^{\mathrm{Micro}}$ and the macro slope volatility by
$\sigma_K^{\mathrm{MacroSlope}}$. Using standard convolution arguments, we get the total variance in
the $K^{\mathrm{th}}$ period as

$$\left(\sigma_K^{\mathrm{MicroMacro}}\right)^2 \tau_{K,K+1} = \left(\sigma_K^{\mathrm{Micro}}\right)^2 \tau_{K,K+1} + \left(\sigma_K^{\mathrm{MacroSlope}}\right)^2 \tau_{K,K+1}^2 \qquad (51.20)$$

The second term is due to the uncertainty of the macro slope and induces a time
dependence in $\sigma_K^{\mathrm{MicroMacro}}$. The figure below shows an example.

[Figure: Macro-Micro Vol vs Time^Power (power = 0.6). Series: T^Power, Tot Micro Vol, Tot Macro Vol, Micro-Macro Vol. Vertical axis: Vol.]


Here we took the micro volatility as given by a mean-reverting Gaussian
model and took equal step lengths $\tau_{K,K+1}$. When this effective time-dependent
volatility $\sigma_K^{\mathrm{MicroMacro}}$ is fit with a power law $t^{\zeta}$ in time, we find that the power is
different from one half, e.g. $\zeta = 0.6$.
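
A minimal numerical sketch of such a power-law fit follows. It is not the calculation behind the figure: it assumes the total variance over an interval of length tau has the form sigma_micro^2 * tau + sigma_slope^2 * tau^2, uses a constant micro volatility instead of the mean-reverting Gaussian one, and the parameter values and function names are illustrative assumptions.

```python
import numpy as np

def total_std(tau, sigma_micro, sigma_slope):
    """Std. dev. accumulated over an interval tau (assumed form, see lead-in)."""
    tau = np.asarray(tau, dtype=float)
    return np.sqrt(sigma_micro**2 * tau + sigma_slope**2 * tau**2)

def fitted_power(tau, std):
    """Least-squares slope of log(std) vs log(tau): the power in std ~ tau**power."""
    slope, _intercept = np.polyfit(np.log(tau), np.log(std), 1)
    return slope

tau = np.linspace(0.25, 4.0, 16)  # hypothetical interval lengths
power_micro_only = fitted_power(tau, total_std(tau, sigma_micro=0.10, sigma_slope=0.0))
power_mixed = fitted_power(tau, total_std(tau, sigma_micro=0.10, sigma_slope=0.05))
print(f"micro only: {power_micro_only:.3f}, micro + macro slope: {power_mixed:.3f}")
```

With the slope term switched off the fitted power is one half; with it switched on the fitted power lies between one half and one, which is the qualitative effect described in the text.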

To show the versatility of the Macro-Micro model in producing anomalous
dimensions, consider the second figure below. It shows a two-component
structure for the macro vol, but this time with no micro vol. There are two step
lengths chosen at random with equal probabilities. The example has been
constructed to give the same anomalous power behavior, $\zeta \approx 0.6$.

[Figure: Two-Component Macro Vol vs Time^Power 0.6. Series: MacroVol, Time^power. Vertical axis: Vol.]


Technical Analysis and the MM Model 


For the convenience of the reader, we give a lightning summary of technical
analysis⁶. Technical analysis looks for patterns directly in the time-series data
and employs a variety of measuring probes. The most basic are support and
resistance levels. Support and resistance levels are defined using the closing price
level plus or minus a volatility estimate based on the recent trading price range.
Patterns in historical price data are classified with various pictorial expressions
like “head and shoulders”. Trend indicators, for example different moving
averages, are employed. Different moving averages are used and compared with
each other, with special attention to crossing points. Time-differenced series are
used to generate “momentum” indicators. Cyclic behavior is also considered,
from seasonal behavior with a physical basis (e.g. weather) to abstract schemes
(e.g. “Elliott Waves” involving Fibonacci numbers). Besides prices, other
variables like volume and open interest are considered. Concordance between
indications of some subset of these probes is used as input to trading decisions.
The usefulness of technical analysis is a subject of disagreement. Some
technical indicators are simple versions of volatility measures used in standard
risk management. The relation of trading patterns to technical analysis is not
clear. Regardless, many traders and systems use technical analysis. Perhaps
technical analysis acquires some numerical validity in a self-fulfilling prophecy
through feedback of trader activities using technical analysis to the market.


6 Acknowledgements: I thank Rick Sheffield and John Pisanchik for informative
conversations regarding technical analysis.


Connection of Technical Analysis with the Macro-Micro Model 


The obvious conjecture is simple and physical:
• Moving averages in technical analysis correspond to the Macro component,
when taken over macro time scales.
• The short-term fluctuations around the moving averages in technical analysis
correspond to the Micro component.
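
As a rough illustration of this conjecture, the sketch below splits a series into a trailing moving average and the fluctuations around it. The function name, the window length, and the synthetic series are assumptions made for the example; the text does not prescribe a particular averaging scheme.

```python
import numpy as np

def macro_micro_split(x, window):
    """Trailing moving average ("macro") and the residual around it ("micro")."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    macro = np.convolve(x, kernel, mode="valid")  # mean of x[i-window+1 .. i]
    micro = x[window - 1:] - macro                # fluctuation around the average
    return macro, micro

rng = np.random.default_rng(0)
t = np.arange(500)
series = 0.01 * t + 0.2 * rng.standard_normal(t.size)  # slow trend plus fast noise
macro, micro = macro_micro_split(series, window=50)
```

By construction the two pieces add back to the original series wherever the moving average is defined, and the moving-average piece varies much more slowly than the raw series.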


The Macro-Micro Model and Data 


We present here some further indications for long time scales. We used the 
interest rate data in Ref. v between 1950 and 1996, i.e. 46 years. Without doing 
any fits, we generated MC paths using the Macro model. That is, in distinction to 
the previous chapters, we did not fix the Macro drifts using data. 

Consider the following graph giving illustrative results for the 3-month 
treasury rate. The path shown was selected by hand, but no fit was performed. 


[Figure: R3 Tsy 1950-1996 vs Model path #19 (Not a fit). Series: R3 Tsy path, R3 Tsy data. Vertical axis: Rate %. Horizontal axis: Jan-50 to Jan-95.]


This shows that the general behavior of the data can be qualitatively reproduced
by the Macro-Micro model over a wide range of rates and over long time periods.


The bottom line is that the Macro-Micro model is capable of providing realistic
results on long time scales for interest rates.


Data, Models, and Rate Distribution Histograms


In the discussion of data and models in Ch. 48, 49 we presented histograms of 
distributions of rates for data and also for a set of Monte-Carlo model paths. All 
models received as input the data volatilities. While the strong mean-reverting 
Gaussian (SMRG) model distribution was in accord with the data, the distribution 
histograms did not agree for the lognormal (LN) model. 

In Ch. 48, we mentioned that, path-by-path, the model output will produce a 
distribution of width on the order of the output volatility, which is (within 
statistical noise) the same as the input data volatility. The problem arises if we 
sum over the distributions of paths. Each path produces a distribution of 
approximately the right width. However, without strong mean reversion, the 
center of the model distribution for a path is displaced randomly away from the 
center of the data distribution. When summed over paths, the model distribution 
without strong mean reversion will appear spread out, even though the output and 
input volatilities are consistent. 

To illustrate, we use a simple model that generates “data” using a Gaussian 
process with a set of fixed random numbers. We also use a Gaussian model, with 
the same input volatility, as the “model”. 

The graph shows a sample run of the Gaussian model along with the “data”. 


[Figure: Deviation from Data QE Path, one run. Series: Data, Model. Vertical axis: N events. Horizontal axis: deviation, from -0.90 to 0.90.]


The deviations are measured with respect to long-time-scale trends, defining 
the quasi-equilibrium historical path, as explained in previous chapters. For the 
toy model, we used a straight line with slope 4/7 P^ = Cae E ) / N , where 


к was the average of the “data” points and N the number of time intervals.
The widths of the model and “data” distributions are roughly equal (as they
should be), but the model distribution is displaced relative to the “data” (in this 
particular case, to the left). 

In other runs, the model distribution will be displaced to the right. This 
phenomenon generates a wider distribution of the model over many runs with 
respect to the data, even though each run has (within statistical error) the same 
width as the data. 
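
The widening can be checked with a small simulation. The sketch below is a simplification under stated assumptions: it uses driftless Gaussian random walks without mean reversion, measures levels relative to zero instead of a fitted quasi-equilibrium line, and the path count, step count, and volatility are illustrative.

```python
import numpy as np

def path_widths(n_paths=200, n_steps=250, vol=0.05, seed=1):
    rng = np.random.default_rng(seed)
    # Gaussian random walks without mean reversion, all starting at zero.
    paths = np.cumsum(vol * rng.standard_normal((n_paths, n_steps)), axis=1)
    within = paths.std(axis=1)    # width of each path's own histogram
    centers = paths.mean(axis=1)  # center of each path's histogram
    pooled = paths.std()          # width of the histogram summed over all paths
    return within, centers, pooled

within, centers, pooled = path_widths()
print(f"mean per-path width: {within.mean():.3f}, pooled width: {pooled:.3f}")
```

The pooled variance equals the average per-path variance plus the variance of the path centers, so the summed histogram is wider than a typical single-path histogram whenever the centers are displaced from one another.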


Negative Forwards in Multivariate Zero-Rate Simulations 


In this section, a simple lognormal simulation for zero-coupon rates without
mean reversion is shown to generate negative forward rates. We use a
pedagogical 5-dimensional simulation for 21 unit time steps. The simulation is
for term forward zero-coupon rates $f^{(T)}(t)$; cf. Ch. 7. This rate, observed at
time $t$, applies between dates $(t, t+T)$.

Now consider the forward rate $f^{(T,T+1)}(t)$,
observed at time $t$, which applies between dates $(t+T, t+T+1)$. The graph
below shows the results for one run that produced negative values $f^{(3,4)}(t) < 0$
(“FwdFwd 34”) for $t \ge 15$. Other forward rates went negative in other runs.


[Figure: Calculated Fwd-Fwd Rates vs. Time. Series: Term Fwd 01, FwdFwd 12, FwdFwd 23, FwdFwd 34, FwdFwd 45. Vertical axis from -10.0 to 20.0.]

In the simulation, correlations and volatilities used were typical values
$\rho^{(T,T')} \in (0.55, 0.9)$, $\sigma^{(T)} \in (0.1, 0.2)$. We started at time $t_0$ with an upward
sloping forward curve. No drift was included. No-arbitrage drift corrections are
quadratic in the volatility and generally small.
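
A minimal sketch of this kind of experiment is given below. Several ingredients are assumptions, not taken from the text: continuous compounding is used to derive forward rates from zero rates, the starting curve, the volatilities, and the correlation structure 0.9^|i-j| are illustrative choices within the quoted ranges, and the function names are hypothetical.

```python
import numpy as np

def forward_forward(zero_rates, maturities):
    """Forward rate between consecutive maturities, continuous compounding."""
    z = np.asarray(zero_rates, dtype=float)
    T = np.asarray(maturities, dtype=float)
    return (z[..., 1:] * T[1:] - z[..., :-1] * T[:-1]) / (T[1:] - T[:-1])

def fraction_of_runs_with_negative_forward(n_runs=200, n_steps=21, seed=0):
    rng = np.random.default_rng(seed)
    T = np.arange(1.0, 6.0)                              # maturities 1..5
    z0 = np.array([0.050, 0.055, 0.060, 0.065, 0.070])   # upward-sloping start
    vols = np.array([0.20, 0.175, 0.15, 0.125, 0.10])    # per unit time step
    corr = 0.9 ** np.abs(np.subtract.outer(T, T))        # about 0.66 to 0.9
    chol = np.linalg.cholesky(corr)
    hits = 0
    for _ in range(n_runs):
        shocks = rng.standard_normal((n_steps, 5)) @ chol.T    # correlated shocks
        log_z = np.log(z0) + np.cumsum(vols * shocks, axis=0)  # no drift, no mean reversion
        if (forward_forward(np.exp(log_z), T) < 0).any():
            hits += 1
    return hits / n_runs

frac = fraction_of_runs_with_negative_forward()
print(f"fraction of runs with at least one negative forward: {frac:.2f}")
```

The zero rates themselves stay positive under lognormal dynamics; the forwards go negative because they are differences of maturity-weighted zero rates that are free to drift apart.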

This observation generalizes Ch. 48, where coupon yields were used. 
Negative forward rates are forbidden by no arbitrage considerations. Our 
conclusion remains: multivariate models using composite rates without strong 
mean reversion may be problematic. 


If the forward rates themselves are simulated directly and not derived from
composite rates, then a zero-rate barrier, as provided by lognormal or other
dynamics, can be imposed to prevent negative rates.


Finance Models Related to the Macro-Micro Model 


Derman’s Equity Regimes and the Macro-Micro Model 


In the chapter on volatility skew, we discussed various regimes for equities as 
characterized by Derman (trending, range-bound, jumpy). These regimes seem to 
correspond to the characteristics of the Macro-Micro model. In the Macro-Micro 
model, we have quasi-random drifts or slopes corresponding to trending (with 
positive or negative drifts) and range-bound behavior (small drift), all with 
limited volatility. Jumps play a supplemental role in the micro component. 


Seigel’s Non-equilibrium Dynamics and the MM Model 
Les Seigel" has proposed an innovative and interesting time-dependent 
dynamical framework. He writes a relation between the stock price S and stock 
index level $I$ as $S = S(I,t)$, allowing for a specific time dependence. Assuming
lognormal dynamics for the index produces a diffusion equation for the stock 
price, using the usual rules. Solutions of the diffusion equation produce 
relaxation time scales for non-equilibrium behavior for stocks relative to the 
index. 

Seigel finds long characteristic times, on the order of years. These are
representative of Macro times in the Macro-Micro Model.


Macroeconomics, Economics Literature, Macro-Micro Model


This section presents references to models in the economics literature for interest
rates and for FX. These models all relate to aspects of the Macro-Micro idea⁷.

The development of the Macro-Micro model took place in the world of
finance. The association of macroeconomics with the Macro component was a
hypothesis that came at the end of the investigation.

It is quite gratifying to see after the fact that the Macro-Micro idea fits in with 
strikingly parallel ideas developed in the world of macroeconomics. 

Indeed, it is perhaps high time that the economists and the finance gurus 
started talking to each other regarding the underlying dynamics of variables over 
different time scales. 


The Fed and the Macro Interest Rate Component 


The fact that the average long-time-scale behavior of the short-maturity sector of 
the yield curve is driven by the Fed is obvious. Innumerable articles and papers 
exist describing and commenting on this situation. Fed policy is a large subject, 
but all we need here is that the Fed actions are driven and determined one way or 
the other by macroeconomics. Thus, the average behavior of the short end of the 
yield curve is driven by macroeconomics. To the extent that the long-maturity 
end of the curve is correlated with the short end, the average long-time-scale 
behavior of the whole yield curve is driven by macroeconomics. 

This idea was in fact a strong part of the motivation behind the original 
construction of the Macro-Micro model in 1988. 


Interest Rates: TIPS (Bertonazzi and Maloney) 


In Ch. 16, we discussed TIPS. In that chapter, we mentioned an analysis of 
Bertonazzi and Maloney". They stated: "The implication is that inflation is the 
biggest component of variance in the yield on government bonds". This is 
consistent with the Macro-Micro model. The most straightforward consequence 
of the Macro-Micro model is that most interest-rate variation results from the 
Macro component with macroeconomic causes, with small fluctuations due to the 
Micro component. Bertonazzi and Maloney identified inflation as the main 
macroeconomic driver for government bonds. 


FX: Fat Tails and Currency Crises (Kaminsky/Reinhart, Omarova) 


So far, we have considered the Macro and Micro components of the Macro-
Micro model to be independent. We now quote some evidence that these two
components may be connected in the FX market. In particular, the improbable-
event fat-tail jumps that form a critical part of the Micro component may be
directly related to macroeconomics⁸.


7 Partial Literature Search: We have not performed a systematic search of the
literature.

It has long been a goal of economics to predict currency crises. Using the
language of this book, the topic is the attempt to establish a connection of
long-time-scale macroeconomics with fat tails in the FX market. A body of work that
has made significant progress along these lines has emerged in the economics
literature. This work was pioneered by Kaminsky and Reinhart (KR). The field
was significantly extended by Omarova, and we follow this work as described in
Mills and Omarova (Ref. viii).

Currency crises are first defined historically using a “currency pressure
index” defined by KR that involves a normalized sum of relative changes of the
exchange rate⁹ and relative changes of FX reserves. When this currency pressure
index drops enough from its average, a currency crisis is defined.

Next, threshold levels are defined for various economic variables¹⁰. A “crisis
window” before one of the historical crises is defined during which a positive
alarm signal is considered to have been successful and relevant for predicting that
crisis. Threshold levels for each economic variable are defined by minimizing
combinations of type-I errors (missed crises) and type-II errors (false alarms).
Two threshold levels (“stress” and “critical”) are defined.

The next step constructs the alarm signals. Alarm signals are defined by
indicator functions of economic variables that go above their respective threshold
levels. A composite signal-alarm indicator function is constructed for each
country. If this alarm goes above a certain level, a currency crisis is signaled.
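
The threshold and alarm steps can be sketched as follows. This is only an illustration of the general signal approach, not the Kaminsky-Reinhart or Omarova implementation: the error combination (a plain sum of the two error rates), the weights of the composite alarm, and all function names are assumptions.

```python
import numpy as np

def best_threshold(indicator, in_crisis_window, candidates):
    """Pick the candidate threshold minimizing missed-crisis rate + false-alarm rate."""
    x = np.asarray(indicator, dtype=float)
    w = np.asarray(in_crisis_window, dtype=bool)
    best, best_err = None, np.inf
    for thr in candidates:
        alarm = x > thr
        missed = np.mean(~alarm[w]) if w.any() else 0.0          # type-I: no alarm inside a window
        false_alarm = np.mean(alarm[~w]) if (~w).any() else 0.0  # type-II: alarm outside any window
        err = missed + false_alarm
        if err < best_err:
            best, best_err = thr, err
    return best, best_err

def composite_alarm(indicators, thresholds, weights):
    """Weighted sum over variables of the indicator functions 1[x_k(t) > threshold_k]."""
    x = np.asarray(indicators, dtype=float)                # shape (n_variables, n_times)
    flags = x > np.asarray(thresholds, dtype=float)[:, None]
    return np.asarray(weights, dtype=float) @ flags.astype(float)

thr, err = best_threshold([0.1, 0.2, 0.9, 0.8, 0.3, 0.1],
                          [False, False, True, True, False, False],
                          candidates=[0.0, 0.5, 1.0])
alarm = composite_alarm([[0.1, 0.9], [0.2, 0.7]], thresholds=[0.5, 0.5], weights=[0.6, 0.4])
```

In this toy input the middle candidate separates the crisis-window observations from the rest exactly, so it is selected with zero combined error; a crisis would then be signaled when the composite alarm exceeds a chosen level.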

Backtesting and out-of-sample testing show that the results are successful.
The Omarova model produces 79% correctly-called crises over many countries,
with over 40% probability of crisis given an alarm.


FX: Time Scales (Feiger and Jacquillat) 


A striking and straightforward description of the FX market containing separate
dynamics for short-time-scale and long-time-scale descriptions is found in the
work of Feiger and Jacquillat¹¹. The description of these dynamics seems
qualitatively very similar to the Macro-Micro model.


8 Acknowledgements: I thank Nora Omarova for very helpful discussions of her work
(Ref. vii).


9 REER: Actually, the “Real Effective Exchange Rate” is used by Mills and Omarova.
For a given country, the REER is a trade-weighted combination of inflation-adjusted
(“real”) bilateral exchange rates for that country with its top trading partners.


10 Economic Variables: Mills and Omarova use eight variables. They are the REER,
Short Term Capital Inflows (% GDP), Current Account Balance (% GDP), Short Term
Debt/Reserves, Equity Prices, Industrial Production, Exports, and M2/Reserves.

These authors state in Chapter 5: Exchange Rate Behavior in the Long Run: 
“First, it will become apparent that exchange rate behavior can be understood 
only in the context of a global macroeconomic perspective.... Understanding is 
better for the material dealt with in this chapter and not very good for short-run 
exchange rate behavior analyzed in Chapter 6”. This is very similar to the 
statement of the Macro long-term component. 

These authors state in Chapter 6: Short-Run Exchange Rate Movements in a 
Free Market: “Thus we propose that, in the short-term, the spot rate is simply 
pulled around by the actions of speculators in the forward exchange markets and 
by trading on the capital markets”. This is quite similar to the statement of the 
Micro short-term component. 


FX: Time Scales (Blomberg) 


S. B. Blomberg¹² mentions work in the literature that ascribes different
dynamics along the lines of the Macro-Micro model. For short time scales (the
Micro component region) he states: “Starting with the influential work of Meese &
Rogoff (1983) a large body of research has documented that the random walk
model out-performs a wide class of structural and time series (univariate and
multivariate) models in short run out-of-sample prediction.”

For long time scales (the Macro component region) he states: “However,
Taylor (1995) suggests models that rely on rich dynamic specifications can beat
a random walk over the long horizon, see Mark (1995) and Chin and Meese
(1995).”


FX: Time Scales (Chin) 


M. Chin also distinguishes short and long time scales with different dynamics in
the FX market. He states: “In the short run, nominal exchange rates depend
primarily on financial market variables and expectations....” At long time scales,
he states: “I then present results of empirical tests of the theory, which suggest
that, in many instances in East Asia, enhanced productivity growth in the
tradable goods sector is associated with a long-run strengthening of the real
exchange rate”. Again, a division of dynamics between short and long time
scales is present in the Macro-Micro approach.


11 Acknowledgement: I thank Alain Nairay for pointing out the work of Feiger and
Jacquillat to me.


12 Short Time Scales: Blomberg's focus is on alternative descriptions at short time
scales. He presents evidence that short-run exchange fluctuations can be described using
“narrative measures” involving certain dummy variables of monetary policy.


52. A Function Toolkit (Tech. Index 6/10)


In this chapter, a “toolkit” of functions potentially useful for analyzing business
cycles is presented. These functions could then form part of the macro
component of financial markets operating over long time scales¹. These
functions may also be useful on shorter time scales for trading, as described at the
end of this chapter.

The functions were originally used in describing some threshold phenomena
in high-energy physics² and subsequently in engineering³. Simpler forms of
these functions have been used in finance for a long time.
The functions can be used to specify a general form of a drift.

The first part of the macro component is composed of quasi-random trends,
discussed in Ch. 50. The connection of the formalism with derivative pricing was
shown. A general description of the macro component would be a combination of
these quasi-random trends and the function toolkit cycles⁴.


In this chapter, the signals $x(t)$ are first assumed deterministic without a
stochastic random component. While the macro component as used in this book
does have a postulated quasi-random behavior, this behavior does not scale down
to small times. Hence, a given realization of a macro path essentially acts as if it
were deterministic. We are now faced with a classic signal-processing problem.
One basic problem in signal analysis is the choice of reasonable functions to
describe the data. The function toolkit sets out to address this problem.


1 Cyclical Analysis and the Macro Component of Finance: We believe that the
cyclical analysis approach to financial time-series analysis, with long-standing
proponents, is valid for part of the macro component. Our purpose here is to introduce a
set of functions that could be useful in such analyses.


2 Connection of the Function Toolkit with Thresholds in High-Energy Physics: The
application was the description of data in high-energy diffractive scattering. The
translation of the formalism: the function f(t) is the imaginary part of a quantum-
mechanical amplitude, time is the logarithm of the beam energy, and the time thresholds
correspond to the successive production in energy of particles with different quantum
numbers (like strangeness and charm), having successively increasing masses.


3 Connection of the Function Toolkit with Time Thresholds in Engineering: The
application was the description of the measured response of some equipment in spatially
separated locations to input transient electromagnetic fields. The time-threshold behavior
occurred because different parts of the equipment reacted to the electromagnetic fields at
different times. The example presented in this chapter comes from this work.


4 Quasi-Random Trends and the Macro Component of Finance: Trends over random
but macro-length time periods (“quasi-random”) are the first macro component. We
describe a macro model for quasi-random secular trends in Ch. 50.

The practical applications so far have involved performing a least-squares fit 
of a given signal with a limited number of functions, in either the time domain or 
the frequency domain. This is parametric signal estimation. We are not after 
“exact” results. As mentioned above, the method has already proved to be 
practical and useful for good characterizations of various data. 

At the end of the chapter, we consider the relation of the toolkit functions to 
other techniques, including wavelets. 


Time Thresholds; Time and Frequency; Oscillations 


Many signals have an arrival onset-time threshold nature⁵. In finance, these time
thresholds may have a clear significance: something definite happens at a given
time, providing an excitation to the financial system. There is usually some
inertia in the system, so the reaction of the variable $x$ to the excitation may have
a continuous lagged response, with some time to ramp up. The reaction naturally
has some intrinsic magnitude. In addition, the initial reaction may be either
positive or negative, so there is an initial phase. The reaction will damp down
over some time period, depending on the excitation and situation. Various
successive excitations can occur in a given macro time period.

Signals that are due to given excitations tend to be limited in time. Signals
also tend to be band-limited in frequency $\omega$. There may be a tendency to oscillate
as markets go higher and lower in response to an excitation. First the market may
over-react going (say) down, then may over-react in a rebound fashion going up
in some characteristic up-down time $\tau_a$. Continuation of this behavior forms
oscillations with frequency $\omega_a = 1/\tau_a$. However, things are not so definite
because successive up-down movements may have reaction times influenced by
the environment. For a given excitation $a$, the characteristic up-down time $\tau_a$
occurs with some uncertainty $\Delta\tau_a$. This determines the bandwidth of the
oscillations $\Delta\omega_a$ if we use the standard uncertainty relation $\Delta\omega_a = 1/\Delta\tau_a$.

Choosing oscillations with various weights inside this bandwidth produces a
variable period of oscillations in the time series $x(t)$. However, the nature of the
variable period may be even more complicated than indicated by this standard
uncertainty relation.


5 Thresholds: Please note that here we mean by threshold a point in time characterized
by a time delay, not a minimum value for the signal.


Summary of Desirable Properties of Toolkit Functions


The description above indicates that functions with the following seven
properties should be useful in describing business-cycle behavior in finance:

1. Time-onset threshold $t_{\mathrm{threshold}}$
2. Ramp-up time $\tau_{\mathrm{ramp-up}}$
3. Damping time $\tau_{\mathrm{damping}}$
4. Oscillation time $\tau_{\mathrm{osc}}$
5. Variable period in oscillations
6. Intrinsic magnitude
7. Initial phase


With functions of this sort, the goal is to provide a physically motivated and 
economical description of time series. Each function has seven possible 
parameters. 

The main point is that because each function looks like a physically possible 
signal, the number of functions required can turn out to be small. 


What about Gaps or Jumps? 


We could put in gaps or jumps over short times as a separate requirement. 
However, our interest here is in behaviors over macro time periods (weeks to 
years). We can in fact model a gap behavior by specific choices of parameters, as 
we shall see. Moreover, in so doing, the gap behavior will have a natural 
associated time scale. However, a more natural explanation for gaps or jumps is 
perhaps nonlinear diffusion, as described in Ch. 46. 


Construction of the Toolkit Functions 


A set of functions $f(t)$ that has all seven of the features is defined as follows⁶.

$$f(t;\{A_\beta\}) = \mathrm{Re}\left[\,|C|\,e^{i\psi}\,\Theta(Y)\,Y^{z}\,e^{\alpha Y}\right] \qquad (52.1)$$

We have indicated generically the parameters by $\{A_\beta\}$. These parameters should
be understood when we write $f(t)$. We now describe the parameters
individually.


6 Practical Functions: Examples using simpler functions that are subsets of the f(t)
toolkit functions have been used on the Street and are in the finance literature.


Time Thresholds
An explicit threshold time $t_{\mathrm{threshold}}$ is built in, and $Y$ is the time difference from
that time:

$$Y = t - t_{\mathrm{threshold}} \qquad (52.2)$$

The usual Heaviside function makes $f(t) = 0$ below $t_{\mathrm{threshold}}$:

$$\Theta(Y) = \begin{cases} 0 & (t < t_{\mathrm{threshold}}) \\ 1 & (t > t_{\mathrm{threshold}}) \end{cases} \qquad (52.3)$$


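Power-Law Takeoff
Write the complex parameter $z$ as $z = z_{\mathrm{Re}} + i z_{\mathrm{Im}}$, with $z_{\mathrm{Re}} \ge 0$. If $z_{\mathrm{Re}} > 0$, a
power-law increase prevents $f(t)$ from taking off suddenly from threshold. The
ramp-up time $\tau_{\mathrm{ramp-up}}$ is provided by the real part of the exponent in $Y^{z}$. Setting $|Y^{z}| = e$, we get

$$\tau_{\mathrm{ramp-up}} = \exp(1/z_{\mathrm{Re}}) \qquad (52.4)$$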


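Damping


The parameter $\alpha = \alpha_{\mathrm{Re}} + i\alpha_{\mathrm{Im}}$ is complex, and the real part $\alpha_{\mathrm{Re}} < 0$ provides
the characteristic damping of the function in time $\tau_{\mathrm{damping}}$. Setting $|e^{\alpha Y}| = 1/e$ we
get

$$\tau_{\mathrm{damping}} = -1/\alpha_{\mathrm{Re}} \qquad (52.5)$$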


Oscillations—Standard and Nonstandard 
Oscillations of a standard nature are provided by the imaginary part $\alpha_{\mathrm{Im}}$
that is conjugate to $Y$. If we take a half-period as characterizing an oscillation,
we have the characteristic half-period oscillation time $\tau_{\mathrm{osc}}^{(\alpha_{\mathrm{Im}})}$ given by

$$\tau_{\mathrm{osc}}^{(\alpha_{\mathrm{Im}})} = \pi/\alpha_{\mathrm{Im}} \qquad (52.6)$$

Variability in the oscillations can be taken into account with a weighted sum
inside a bandwidth, but there is another possibility provided by the factor $Y^{z}$
using the imaginary part $z_{\mathrm{Im}}$. This variable is conjugate to $\ln(Y)$ instead of $Y$.
If $\alpha_{\mathrm{Im}} = 0$, $f(t)$ oscillates with characteristic time for the first half oscillation
as

$$\tau_{\mathrm{osc}}^{(z_{\mathrm{Im}})} = \exp(\pi/z_{\mathrm{Im}}) \qquad (52.7)$$

Since this expression is nonlinear, the oscillations are not uniform. The
second half of the first oscillation takes place over the time
$[\exp(2\pi/z_{\mathrm{Im}}) - \exp(\pi/z_{\mathrm{Im}})]$ that is different from the time for the first half of
the first oscillation.
Now if both $\alpha_{\mathrm{Im}}$ and $z_{\mathrm{Im}}$ effects are present, we can for small $z_{\mathrm{Im}}$ find the
first half-oscillation time perturbatively. Write $\tau_{\mathrm{osc}} = \tau_{\mathrm{osc}}^{(\alpha_{\mathrm{Im}})} + \delta\tau_{\mathrm{osc}}$ and expand
$z_{\mathrm{Im}} \ln \tau_{\mathrm{osc}}$ to first order to get $\delta\tau_{\mathrm{osc}}$. We get the approximation for $\tau_{\mathrm{osc}}$ as

$$\tau_{\mathrm{osc}} \approx \tau_{\mathrm{osc}}^{(\alpha_{\mathrm{Im}})} \left[ 1 - \frac{z_{\mathrm{Im}} \ln\left(\tau_{\mathrm{osc}}^{(\alpha_{\mathrm{Im}})}\right)}{z_{\mathrm{Im}} + \pi} \right] \qquad (52.8)$$
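
The perturbative half-oscillation time can be compared with a direct numerical solution. The sketch below assumes that the first half oscillation ends when the total phase alpha_Im * tau + z_Im * ln(tau) reaches pi, and that alpha_Im > 0 and z_Im >= 0; the function names and the sample values are illustrative.

```python
import math

def half_oscillation_exact(alpha_im, z_im, lo=1e-9, hi=1e3, n_iter=200):
    """Solve alpha_im*tau + z_im*ln(tau) = pi for tau by bisection."""
    def phase(tau):
        return alpha_im * tau + z_im * math.log(tau) - math.pi
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if phase(mid) > 0:   # phase is increasing in tau, so the root is below mid
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def half_oscillation_approx(alpha_im, z_im):
    """First-order result: tau0 * (1 - z_im*ln(tau0)/(z_im + pi)), tau0 = pi/alpha_im."""
    tau0 = math.pi / alpha_im
    return tau0 * (1.0 - z_im * math.log(tau0) / (z_im + math.pi))

exact = half_oscillation_exact(2.0, 0.1)
approx = half_oscillation_approx(2.0, 0.1)
print(f"exact: {exact:.5f}, first-order: {approx:.5f}")
```

For a small imaginary part of the power the two values agree closely, and with that part set to zero the exact solver returns pi divided by the oscillation frequency.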


Magnitude and Phase
Parameters $|C|$ and $\psi$ specify the magnitude and phase of $f(t)$.
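
A minimal sketch of a single toolkit function follows, assuming the form Re[|C| e^{i psi} Theta(Y) Y^z e^{alpha Y}] with Y = t - t_threshold. The function and argument names are hypothetical, the symbol psi for the phase is an assumption, and the sample values are illustrative numbers of the kind used for a sharply damped first term.

```python
import numpy as np

def toolkit_f(t, t_threshold, alpha, z, magnitude, phase):
    """Evaluate Re[|C| e^{i psi} Theta(Y) Y^z e^{alpha Y}] with Y = t - t_threshold."""
    y = np.atleast_1d(np.asarray(t, dtype=float)) - t_threshold
    out = np.zeros_like(y)
    on = y > 0.0                          # Heaviside factor: zero at and below threshold
    y_on = y[on]
    amp = magnitude * np.exp(1j * phase)
    out[on] = np.real(amp * np.exp(z * np.log(y_on) + alpha * y_on))
    return out

def characteristic_times(alpha, z):
    """Ramp-up, damping, and standard half-oscillation times (None if undefined)."""
    alpha, z = complex(alpha), complex(z)
    return {
        "ramp_up": np.exp(1.0 / z.real) if z.real > 0 else None,
        "damping": -1.0 / alpha.real if alpha.real < 0 else None,
        "half_oscillation": np.pi / alpha.imag if alpha.imag != 0 else None,
    }

alpha, z = -7.586 + 8.783j, 0.469 + 0.070j   # illustrative complex parameters
values = toolkit_f([0.0, 0.022, 0.082], 0.022, alpha, z, magnitude=4.456, phase=3.159)
times = characteristic_times(alpha, z)
```

A signal is then modeled as a sum of a few such functions, each with its own threshold, so that later terms switch on only after their onset times.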


Example of a Function Toolkit Application 
To illustrate the richness of the phenomena that can be described with the toolkit,
here is a graph of a series $x(t)$ built up by some functions $f(t)$:

[Figure: Example of a Signal Described by the Function Toolkit. Series: Total f1+...+f7 signal. Vertical axis from -1.0 to 0.6.]

There is at first a sharp drop and rebound, followed by a sequence of some
quasi-oscillatory behaviors. Two terms $f_1$, $f_2$ are used to produce the drop/rebound:


[Figure: Functions f1, f2 Describing the Drop and Rebound. Plotted series: sum f1 + f2;
axis ticks run from 0.0 to 9.6 in steps of 0.6.]

Then the rest of the seven f(t) functions, $f_3, \ldots, f_7$, describe the rest of the series:

[Figure: Functions f3, ..., f7 past the Initial Drop/Rebound. Plotted series: f3, f4, f5,
f6, f7.]


All seven features described above can be seen in these functions, including the 
time thresholds, ramp-up, damping, oscillations (sometimes with anomalous 
features) and phase. Their appearances are qualitatively different. The parameters 
for these functions follow⁷:


Parameters for the Signal Components 


Term  t_threshold  α_Re      α_Im    |C|     ψ        z_Re    z_Im
1     0.022        -7.586    8.783   4.456   3.159    0.469   0.070
2     0.160        -11.038   8.599   1.041   0.276    0.000   -0.020
3     0.266        -1.055    0.505   0.199   2.61?    0.897   0.400
4     ?            ?         2.962   0.102   2.975    2.023   -0.020
5     1.25?        -2.046    1.934   0.269   2.640    0.992   0.000
6     ?            ?         1.102   ?       -2.101   7.465   0.000
7     4.390        -0.719    1.06?   0.054   2.748    2.828   2.133

(? marks an entry or digit that is illegible in the source.)


Laplace Transform of a Function f (t) in the Toolkit 


The Laplace transform of any function $f(t)$ is defined as usual as

$L(J; \{A_\beta\}) = \int_0^\infty e^{-Jt} f(t; \{A_\beta\})\, dt$   (52.9)

The inverse is

$f(t; \{A_\beta\}) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} e^{Jt} L(J; \{A_\beta\})\, dJ$   (52.10)


⁷ What is this Example? This is the seven-term fit of the response to an electric field in
Ref. iii with somewhat different parameters. This example exhibits many of the
characteristics that we have seen in macro financial time series.



Again the parameters $A_\beta$ are those described above. In Eqn. (52.10), $c$ is taken to
the right of all singularities of $L(J; \{A_\beta\})$ in the complex $J$-plane. The
connection with the usual Fourier transform variable is $J = -i\omega$.

Since we have thresholds in time, they must be reproducible by the Laplace
transform. This is the case; for $Y < 0$, we close the contour in the right-half
$J$-plane, producing zero for $f(t)$ below $t_{\text{threshold}}$ as desired. For reference, the
Laplace transform of $f(t)$ is:

$L(-i\omega; \{A_\beta\}) = \frac{1}{2} |A|\, e^{i\omega t_{\text{threshold}}} \left[ \frac{e^{i\varphi}}{(-i\omega - \alpha)^{z+1}} + \frac{e^{-i\varphi}}{(-i\omega - \alpha^*)^{z^*+1}} \right]$   (52.11)

Here,

$|A|\, e^{i\varphi} = |C|\, e^{i\psi}\, \Gamma(z+1)$   (52.12)


If $z$ is an integer, there are multiple poles in Eqn. (52.11). If $z$ is not an integer,
there are branch cuts; these are treated by standard techniques. Also, $\Gamma(z+1)$ is
the usual gamma function; if $z = n$ is integral, $\Gamma(z+1) = n!$.
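For readers who want to see where Eqn. (52.11) comes from, here is a short derivation sketch. It assumes that a toolkit function has the form $f(t) = \mathrm{Re}[\,|C| e^{i\psi} Y^z e^{\alpha Y}]\,\theta(Y)$ with $Y = t - t_{\text{threshold}}$; this form is an assumption made for the sketch, chosen because it is consistent with the parameters described in this chapter and with Eqn. (52.12).

```latex
% Assumed form (not restated in this section):
%   f(t) = Re[ |C| e^{i\psi} Y^{z} e^{\alpha Y} ] \theta(Y),   Y = t - t_threshold
\begin{align*}
\int_{t_{\text{threshold}}}^{\infty} e^{-Jt}\, Y^{z} e^{\alpha Y}\, dt
  &= e^{-J t_{\text{threshold}}} \int_{0}^{\infty} Y^{z} e^{-(J-\alpha) Y}\, dY
   = e^{-J t_{\text{threshold}}}\, \frac{\Gamma(z+1)}{(J-\alpha)^{z+1}},
   \qquad \operatorname{Re}(J-\alpha) > 0,\ \operatorname{Re} z > -1 . \\[4pt]
\operatorname{Re}[w] = \tfrac{1}{2}\,(w + w^{*})
  \;\Rightarrow\;
L(J;\{A_\beta\})
  &= \tfrac{1}{2}\, e^{-J t_{\text{threshold}}}
     \left[ \frac{|C| e^{i\psi}\, \Gamma(z+1)}{(J-\alpha)^{z+1}}
          + \frac{|C| e^{-i\psi}\, \Gamma(z^{*}+1)}{(J-\alpha^{*})^{z^{*}+1}} \right] . \\[4pt]
\Gamma(z^{*}+1) = \Gamma(z+1)^{*},\quad
|A| e^{i\varphi} \equiv |C| e^{i\psi}\, \Gamma(z+1),\quad J = -i\omega
  \;\Rightarrow\;
L(-i\omega;\{A_\beta\})
  &= \tfrac{1}{2}\, |A|\, e^{i\omega t_{\text{threshold}}}
     \left[ \frac{e^{i\varphi}}{(-i\omega-\alpha)^{z+1}}
          + \frac{e^{-i\varphi}}{(-i\omega-\alpha^{*})^{z^{*}+1}} \right] .
\end{align*}
```

The time threshold shows up only through the overall factor $e^{i\omega t_{\text{threshold}}}$, and the pole or branch point sits at $J = \alpha$ (and its conjugate), which is why integer $z$ gives poles and non-integer $z$ gives cuts.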


Relation of the Function Toolkit to Other Approaches 


We now compare the function toolkit approach for describing part of the macro 
component of time series to several other approaches for time series analysis: 


Fourier Series * 
Prony Analysis " 
Wavelets "' 
ARIMA Models "! 
Function Toolkit Relation to Fourier Transforms
A term in a Fourier series is a sine wave, and therefore is a special case of f(t)
with no thresholds, ramp-up, or damping. Although a general method, Fourier
series have two drawbacks. The first is that a sine wave does not look much like 
any signal in practice. Second, to describe real signals even approximately, a 
large number of terms are needed. Time thresholds are problematic to describe 
because sine waves start infinitely far in the past and go on forever. 


Function Toolkit Relation to Prony Analysis 


Prony analysis⁸ was invented in 1795. Prony analysis forms a special case of the
analysis described here, lacking the time thresholds⁹. Prony functions have
damping via $\alpha_{\mathrm{Re}}$, and oscillations may be present with $\alpha_{\mathrm{Im}} \neq 0$. However,
$z = 0$. Prony analysis is applicable for radar, where an object has apparent
dimensions small compared to the distance to the object. Then, a returning signal
from the object arrives back at essentially a single time $t_0$. An infinite number
of Prony terms starting at $t_0$ are needed to reproduce a time threshold past $t_0$.


See Ch. 44 for use of Prony functions as interpolating functions for options in 
American Monte Carlo. 


Function Toolkit Relation to Wavelets 
Wavelets have produced an explosion in numerical analysis¹⁰. The spirit of
wavelets is very close to ours; indeed our f(t) functions can be considered in
some sense as wavelets with more complex parameterizations than usual.

Perhaps the main advance for wavelets is that expansions can now be carried 
out in systematic fashion. Wavelets can have different time thresholds with a 
limited time extent, so they look more like signals than Fourier sine waves. 

Still, from our perspective a disadvantage of the wavelets used in practice is
that an individual wavelet is chosen to have a simple shape with only a few
parameters, and so still does not look too much like a real signal. Therefore, a
large number of wavelets are needed (though fewer than for Fourier series) to
describe a time series to some accuracy.


⁸ Acknowledgements: I thank Santa Federico for bringing Prony analysis to my attention.
We had some good times at Bell Labs applying Prony analysis and generalizations along
the lines of the functions described here.


⁹ Who was Prony? The Baron de Prony (1755 - 1839) was a French engineer and a
contemporary of the famous mathematician Fourier (1768 - 1830). It is interesting that
Prony, besides inventing Prony analysis, also collaborated in measuring the speed of
sound, supervised the construction of logarithmic tables to 19 decimals, invented a type
of brake, and modernized several ports of Europe.


¹⁰ Acknowledgement: I thank Alex Grossmann for informative discussions on the
mathematical properties of wavelets, and also for lots of fun playing music.

Again our philosophy here is to obtain a reasonable description in terms of
only a very few f(t) functions that individually generically resemble real-world
signals to begin with. The functions f(t) have the properties of both “time-scale”
wavelets and “time-frequency” wavelets.
The Appendix discusses wavelets a little further. 


Function Toolkit Relation to ARIMA (p,d,q) Models 


ARIMA(p,d,q) models¹¹ ¹² describe a time series in terms of itself and input
Gaussian noises. ARIMA models are complementary to the method we are
proposing. We are proposing to describe at least part of the macro
long-time-scale behavior of time series using the function toolkit. This does not
include noise. Noise is supposed to be assigned to the “micro” component, appropriate
for the description of short time scales, and ARIMA models can play an
important role in describing the micro component. ARIMA models include memory
effects.

Memory effects have been investigated, as described in Ch. 43 (Section V 
and App. B) *. 


Example of Standard Micro “Noise” Plus Macro “Signal” 

We now illustrate the incorporation of micro noise along with the macro toolkit
functions. So far, the series x(t) has been deterministic, built up by the “signal”
$x^{\text{Signal}}(t)$ of the toolkit f(t) functions. Based on empirical studies, the
appropriate micro noise $x^{\text{Noise}}(t)$ seems to be highly mean reverting, and we use
this property. The total is $x(t) = x^{\text{Signal}}(t) + x^{\text{Noise}}(t)$. The noise changes are
given by $d_t x^{\text{Noise}}(t) = -\omega\, x^{\text{Noise}}(t)\, dt + \sigma\sqrt{dt}\, N(0,1)$ for the time evolution with
mean reversion $\omega$. An illustrative x(t) path with the noise specified by
$\omega\, dt = 0.5$ and $\sigma\sqrt{dt} = 0.03$ is plotted below:

[Figure: Macro Signal {f(t)} + Mean-Reverting Gaussian Micro Noise. Plotted series:
total signal + noise; total f1 + ... + f7 signal.]


¹¹ ARIMA(p,d,q) Models - Summary Description: Several stages are combined to get
an ARIMA(p,d,q) model. The first is a moving average MA(q) and the second is an
autoregressive AR(p) that together form an ARMA(p,q) description. The MA(q) model
assumes that each x(t) is generated by a weighted average of q independent Gaussian
noise terms G(t - j dt) with j = 1,...,q. The AR(p) model assumes that x(t) can be written
as a weighted average of p prior x(t - k dt) values with k = 1,...,p. If the series x(t) needs
to be time-differenced d times in order to be described by an ARMA(p,q) process, then
x(t) is itself an integrated ARMA(p,q) process. This is called an ARIMA(p,d,q) process.


¹² Joke: Even though ARIMA models are described by the parameters (p,d,q) there is no
evidence that Prof. Peter Schickele had any influence on the topic. See P.D.Q. Bach’s
opera video, The Abduction of Figaro, 1984. Get it?


The sharp drop and rebound produced by $f_1$, $f_2$ are not affected by the noise.
Past this point, the functions $f_3, \ldots, f_7$ take over. The noise, being mean reverting,
bounces around the signal function sum.
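The mean-reverting micro noise can be sketched in a few lines. The code below discretizes the noise evolution with the per-step values quoted above ($\omega\,dt = 0.5$, $\sigma\sqrt{dt} = 0.03$) and adds the noise to a given signal. The function names, the zero starting value, and the fixed seed are illustrative assumptions, not the procedure used to produce the plot.

```python
import random

def simulate_noise(n_steps, omega_dt=0.5, sigma_sqrt_dt=0.03, x0=0.0, seed=0):
    """Discretized mean-reverting Gaussian noise.

    Each step applies  dx = -omega_dt * x + sigma_sqrt_dt * N(0,1).
    Returns a list of n_steps + 1 values starting at x0 (assumed start).
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        x += -omega_dt * x + sigma_sqrt_dt * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def add_noise(signal, **noise_kwargs):
    """Total series x = signal + noise, point by point."""
    noise = simulate_noise(len(signal) - 1, **noise_kwargs)
    return [s + e for s, e in zip(signal, noise)]

if __name__ == "__main__":
    flat_signal = [0.0] * 10          # placeholder signal, for illustration only
    print(add_noise(flat_signal))
```

With these step parameters the noise is an AR(1) process with coefficient 0.5, so its stationary standard deviation is $0.03/\sqrt{1 - 0.25} \approx 0.035$, which is why the noisy path stays close to the signal.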


Connected 3-Point Green Function / Auto-Correlation Analysis


In Ch. 49, we emphasized the importance of the third-order time-dependent
Green functions or correlation functions. These were employed and provided a
sharp test in the analyses of interest-rate data that led to the Macro-Micro model
in 1989. We anticipate that this type of analysis will continue to be a useful probe
in obtaining workable models in the present context. For those who skipped Ch.
49, we recall the formalism. Define $\tau_k = k\Delta t$ and $x^{(k)}(t) = x(t - \tau_k)$ with $k$
lags of time step $\Delta t$. We work here with time-differenced series
$d_t x(t) = x(t + dt) - x(t)$. Define $M^{(k)} = M(\tau_k)$ as the time-average of
$d_t x^{(k)}(t)$ and call $C^{(kk')}$ the $[d_t x^{(k)}(t), d_t x^{(k')}(t)]$ covariance
function. Then define $M^{(123)}$ (for $N$ observation points) by

$M^{(123)} = \frac{1}{N} \sum_{l=1}^{N} d_t x^{(1)}(t_l)\, d_t x^{(2)}(t_l)\, d_t x^{(3)}(t_l)$.

The connected 3-point function $C_3^{(123)} = C_3(\tau_1, \tau_2, \tau_3)$ is

$C_3^{(123)} = M^{(123)} - \sum_{\text{CyclicPerm}(ijk)} C^{(ij)} M^{(k)} - \prod_{k=1}^{3} M^{(k)}$   (52.13)
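The connected 3-point function of Eqn. (52.13) is straightforward to compute from a differenced series. The sketch below takes three integer lags, builds the lagged copies, and applies the formula term by term. It works with the plain (non-exponentiated) series and uses population averages (1/N); the function name and the lag convention (lags in units of one time step, all series aligned to the same N points) are assumptions made for illustration.

```python
def connected_three_point(dx, k1, k2, k3):
    """Return (C123, M123) for lagged copies of the differenced series dx.

    dx         : list of time differences d_t x(t_l)
    k1, k2, k3 : non-negative integer lags (units of one time step)
    """
    kmax = max(k1, k2, k3)
    n = len(dx) - kmax                              # points common to all lagged copies
    s1 = [dx[l + kmax - k1] for l in range(n)]      # d_t x(t - tau_k1)
    s2 = [dx[l + kmax - k2] for l in range(n)]
    s3 = [dx[l + kmax - k3] for l in range(n)]

    def mean(v):
        return sum(v) / len(v)

    def cov(u, v):
        mu, mv = mean(u), mean(v)
        return mean([(a - mu) * (b - mv) for a, b in zip(u, v)])

    m1, m2, m3 = mean(s1), mean(s2), mean(s3)
    m123 = mean([a * b * c for a, b, c in zip(s1, s2, s3)])
    # Eqn. (52.13): subtract the cyclic covariance-times-mean terms and the product of means.
    c123 = m123 - (cov(s1, s2) * m3 + cov(s2, s3) * m1 + cov(s3, s1) * m2) - m1 * m2 * m3
    return c123, m123
```

With population averages, this expression equals the third central cross-moment of the three lagged series, which gives a convenient independent check. The diagnostic discussed in the text is the ratio C123/M123.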


We use the exponentiated version of the formalism just described. For 
illustration, we calculate with lags (actually forward moves) of (0,0), (0,10), and 
(4,10) measured in units of At = 1. The results for the signal (taken as the sum of 
the functions f; to б) and for the noise are plotted below”: 


[Figure: Plots of $C^{(123)}/M^{(123)}$ for the Sample Path (3rd-Order Auto Correlations).
Plotted series: Noise C123/M123 and Signal C123/M123 at Lags(0,0), Lags(0,10), and
Lags(4,10); lags in the 2nd and 3rd series are $(k_2, k_3)$.]


We see that for this illustrative example, both the signal and the noise satisfy
$C^{(123)}/M^{(123)} \approx 0$ as the lags become nonzero. Again, for a real application we
would use the constraint $C^{(123)}/M^{(123)} \approx 0$ as a probe to indicate a reasonable
model and a result like $C^{(123)} \approx M^{(123)}$ to indicate a failure. Thus, for example, if
we have data $r_{\text{data}}(t)$ and we believe that a lognormal noise model is relevant, we
would write the putative stochastic equation involving a “data-defined” Wiener
noise term $dz_{\text{data}}(t)$ as follows:

$\frac{d_t r_{\text{data}}(t)}{r_{\text{data}}(t)} = \mu_{\text{data}}\, dt + \sigma_{\text{data}}\, dz_{\text{data}}(t)$   (52.14)

We solve for $dz_{\text{data}}(t)$ as

$dz_{\text{data}}(t) = \left[ \frac{d_t r_{\text{data}}(t)}{r_{\text{data}}(t)} - \mu_{\text{data}}\, dt \right] / \sigma_{\text{data}}$   (52.15)

We then check to see if $C^{(123)}/M^{(123)} \approx 0$ for $dz_{\text{data}}(t)$ or for its exponentiated
form $\exp(dz_{\text{data}}(t))$. If so, the model passes the test. If not, we need to change
the model.


¹³ Parameters: The signal had n = 100 points and the noise had n = 543 points. Also, Δt
was taken as one unit (arbitrary). The results for the noise can be calculated analytically,
but we present the Monte Carlo results as illustrative of what happens in practice.
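Extracting the data-defined noise term of Eqn. (52.15) is a one-line calculation once the drift and volatility have been chosen. The sketch below assumes the lognormal model of Eqn. (52.14) with a constant drift per step and a constant volatility; the function name and argument names are hypothetical, and how the drift and volatility are estimated is left open.

```python
def implied_dz(r, mu_dt, sigma):
    """Data-defined noise dz(t_l) = [d_t r / r - mu*dt] / sigma, per Eqn. (52.15).

    r     : list of observed rates r_data(t_l), all nonzero
    mu_dt : assumed drift per time step (mu * dt)
    sigma : assumed volatility multiplying dz
    """
    return [((r[i + 1] - r[i]) / r[i] - mu_dt) / sigma for i in range(len(r) - 1)]
```

The resulting list can then be passed to a 3-point correlation routine to test whether the assumed noise model is reasonable for the data.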


The Total Macro: Quasi-Random Trends + Toolkit Cycles 


We have presented the macro-micro model with its different time scales using 
two complementary models for the macro component. The first, introduced in 
1989 and described in Ch. 50, was a quasi-random trend or drift model, involving 
random slopes over random time intervals, with a cutoff lower bound time. The 
second, introduced here, is a function toolkit basis for cycles of general shape. 
Economists use both trends and cycles¹⁴, and we will generically follow the idea
in this section.

The most general macro model would be a combination. Calling QRT =
quasi-random trends and TC = toolkit cycles for short,


Src aa Ар Jo Coe T (s (45?) (52.16) 


" Random Trends and Stationary Cycles? We differ from those economists who 
assume that random trends are stochastic on a monthly basis and cycles are stationary 
statistical quantities. For us, trends are quasi-random with explicit time scales, as 
described in a previous chapter. In addition, cycles here are not stationary, but are 
constructed from the function toolkit. 
In the absence of a theory of macroeconomics connecting with our formalism, we
would perform numerical phenomenology to fit parameters with historical data.
Forward pricing with the macro-micro model would use implied macro
parameters that would specify the future macro paths. Similarly to the
quasi-random drift macro model, the cyclic-toolkit macro component would be added.


To illustrate, take the lognormal equation for an interest rate r(t) with
$x = \ln(r)$ and a label “Macro” placed on the drift $\mu_{\text{Macro}}$, viz.

$d_t x(t) = \mu_{\text{Macro}}(t)\, dt + \sigma\, dz(t)$   (52.17)

We want to use the quasi-random drifts and toolkit cycle functions $f(t; \{A_\beta\})$ in
the drift $\mu_{\text{Macro}}$, and we have purposely exhibited the parametric dependence
$\{A_\beta\}$. So we write


Hr D Mns (6{Ag}) (52.18) 


{Ag }cSet, 


We have already described the probability function for the macro quasi- 
random-trend component. 

For the toolkit cycle component, we have less intuition. For example, we
could use a coherent-state prescription involving a Gaussian in the variable
$\omega$ around a central frequency $\omega_0$ and width $\sigma_\omega$:

$\rho[\omega] = \frac{1}{\sqrt{2\pi\sigma_\omega^2}} \exp\left[ -\frac{1}{2\sigma_\omega^2} (\omega - \omega_0)^2 \right]$   (52.19)

The parameters $\omega_0$ and $\sigma_\omega$ would be implied from long-term option data or
taken from historical analysis. Other parameters would be similarly defined. We
would want to respect the use of macro time scales above a cutoff $\tau$ as
discussed in the quasi-random trend Macro component.


Short-Time Micro Regime, Trading, and the Function Toolkit 


Although the main purpose of this part of the book is the description of
long-time-scale macro behavior, we digress briefly to discuss short time-scale
extrapolations in the micro time regime where most trading occurs. We will only
discuss a few issues qualitatively¹⁵.

Short-term micro dynamics are different from the long-term macro dynamics.
In particular, gap jump behavior at short time scales is extremely important.
Intermittent trading at some time scale occurs for all products¹⁶, producing gaps
between times when trading does occur. Sometimes these jumps are large.
Psychology and quick response to news are drivers at short time scales¹⁷.

The efficacy of techniques depends on the quality of real-time data feeds, as
well as historical data for backtesting.

A description of various trading applications is in the references.


Trading and the f (t) Function Toolkit 


The main point of this short section is to suggest that the function toolkit may be 
useful in trading applications. The idea here would be to do real-time fast fits to 
recent data from feeds using functions from the toolkit. These fits would be 
projected into the future for a short time. 

Algorithms using simple functions have been successful, so we think it is 


likely that the more general f (t) functions could provide additional benefit". 


Appendix: Wavelets, Completeness, and the Function Toolkit 


Examples of Wavelet Variables 


The set of wavelet variables can, for example, be time translation and damping
(dilation). These are “time-scale” wavelets. The wavelets suggested by
Grossmann and Morlet (Ref. vii) were time-translated Gaussians with different
widths. Physically this has the attractive picture that a bump in the signal at some
time $t$ is approximated by a Gaussian centered at $t$ with an appropriate width.
Another possibility is the “time-frequency” wavelets first introduced by Gabor
and related to Gaussian coherent states. Here variables are chosen as frequency
and time translation with time decay. This is analogous to a note with some
frequency or pitch $\omega$ - say middle C - in the $k$-th measure of a piece of music,
lasting one beat. The time-frequency uncertainty relation is not violated since the
note will have a frequency uncertainty (think of vibrato). Enough of these
functions are used to cover the entire time-frequency plane, although the choice
of exactly how the covering is made is not unique.


¹⁵ Proprietary Short Term Methods: Most short-term extrapolation methods used in
trading are proprietary, so only generalities can be given here.


¹⁶ Intermittent Trading: Trades only occur at finite time intervals, small for liquid
products and long for illiquid products. The description of prices between times when
trading occurs is just an assumption that cannot be tested since those prices don’t exist.


¹⁷ Vol “News” Models: Some short-term volatility models try to estimate the differential
impacts of specific future news events, e.g. an FOMC meeting.


¹⁸ Analyses Using f(t) Prescriptions: Explicit fits to historical data have been used
involving a few sine waves along with a trend, as input to some trading strategies. We
consider this as an existence proof that the f(t) function toolkit, which contains these fits
as a subset, will be useful.


¹⁹ Crash Oscillations: An excellent example of damped oscillations occurred in day
trading for a few days after the 1987 stock market crash. The market “rang like a bell”.


Completeness and a Plea to the Mathematicians 


One major issue is to find a series expansion of a suitable given function in terms 
of a complete set of wavelets. Completeness means that a unique set of 
coefficients can be determined for the expansion of a function in terms of a set of 
orthonormal basis functions. Orthonormality means that two basis functions 
when multiplied together and suitably integrated give zero if they are different 
and one if they are the same. For certain appropriate sets of wavelets, 
completeness theorems hold. 

Our f(t) toolkit functions have more parameters than the usual wavelets.
This makes the theoretical analysis of expansions more difficult, and no
completeness theorems have been proven. The set {f(t)} for all possible
parameters is actually “over-complete”. That means that there are “too many
possible f(t) functions”. We do not want to just eliminate some parameters,
because they are all useful, as we have seen. We do not know how to restrict the
parameter values to get a complete set.

We avoid this theoretical topic in practice by simply choosing a small number 
of functions and performing a least-squares fit to a time series to fix the 
parameters of these few functions. For this brute-force approach, we essentially 
do not care if there are potentially “too many" choices of such functions for all 
possible values of the parameters. 
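A minimal sketch of such a fit follows. To keep it short and dependency-free, it fits only the magnitude $|C|$ and phase $\psi$ of a single toolkit function, holding the threshold, $\alpha$, and $z$ fixed; in that case the problem is linear and the normal equations can be solved in closed form. It assumes the functional form $f(t) = \mathrm{Re}[\,|C| e^{i\psi} Y^z e^{\alpha Y}]\,\theta(Y)$ with $Y = t - t_{\text{threshold}}$, and all names and sample values are hypothetical. A full fit of all seven parameters per function would need a nonlinear least-squares routine, which is not shown here.

```python
import cmath

def basis(t, t_threshold, alpha, z):
    """Complex shape Y**z * exp(alpha*Y) for Y = t - t_threshold > 0, else 0."""
    y = t - t_threshold
    if y <= 0.0:
        return 0j                                  # time threshold: nothing before it
    return (y ** z) * cmath.exp(alpha * y)

def fit_amplitude_phase(times, data, t_threshold, alpha, z):
    """Least-squares (|C|, psi) with the nonlinear parameters held fixed.

    Writes |C| e^{i psi} = a + i b, so that f = a*Re(g) - b*Im(g) is linear in (a, b).
    """
    s_uu = s_uv = s_vv = s_ud = s_vd = 0.0
    for t, d in zip(times, data):
        g = basis(t, t_threshold, alpha, z)
        u, v = g.real, -g.imag
        s_uu += u * u
        s_uv += u * v
        s_vv += v * v
        s_ud += u * d
        s_vd += v * d
    det = s_uu * s_vv - s_uv * s_uv                # 2x2 normal-equation determinant
    a = (s_ud * s_vv - s_uv * s_vd) / det
    b = (s_uu * s_vd - s_uv * s_ud) / det
    c = complex(a, b)
    return abs(c), cmath.phase(c)

if __name__ == "__main__":
    alpha, z = complex(-1.0, 3.0), complex(0.5, 0.1)        # hypothetical values
    times = [0.05 * i for i in range(1, 101)]
    true_c = 2.0 * cmath.exp(0.7j)
    data = [(true_c * basis(t, 0.0, alpha, z)).real for t in times]
    print(fit_amplitude_phase(times, data, 0.0, alpha, z))  # recovers about (2.0, 0.7)
```

The design choice here is to separate the linear parameters from the nonlinear ones: for any trial values of the threshold, $\alpha$, and $z$, the best magnitude and phase follow immediately, which reduces the dimension of the search.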

Indeed, the attractiveness of the approach is supposed to be that because the 
toolkit functions start out looking like real signals, only a small number of 
functions is needed for a good approximation. 

Still, it would be nice (this is a plea) if some of the powerful wavelet 
mathematicians would figure out how to specify a subset of these f (t) functions 


that is just complete. 
53. Climate Change Risk Management: Business,
Economy, Finance, Society (Tech. Index 5/10)


Summary: Climate Change Risk Management 


There is no ‘plan B’, because we do not have ‘Planet B’'


Climate change is posing increasingly serious risks to business, the economy, 
investors and finance in particular - and to human society in general. Climate 
change impacts constitute the biggest risk discussed in this book'?. R. Rubin’s 
article “How ignoring climate change could sink the U.S. Economy” echoes 
warnings from M. Bloomberg, H. Paulson, R. Litterman, C. Lagarde, and the 
S&P Global Credit Portal" and many others. I think the ensemble of increasingly 
serious climate impacts can potentially destabilize the inherently unstable 
worldwide financial and economic systems into deep crisis. Climate change risk 
should be considered seriously in decisions by investors, businesses, and society. 


' Acknowledgments: I am very grateful for helpful comments on this chapter by: Prof. 
John Abraham (U. St. Thomas), David Andelman (World Policy Journal), Dr. Phil 
Blackwood (and his red Tesla), Prof. Bruce Bueno de Mesquita (New York U.), John 
Cook (U. Queensland), Lynn Dash, Sarah Dash (Alliance for Health Reform), John 
Englander (Author: “High Tide on Main Street”), Mark Fulton (Energy Transition
Advisors), Prof. Katharine Hayhoe (Texas Tech U.), Dr. Gerald Herman (AIG Asset 
Management), Aaron Huertas (Science Communication Officer, UCS), Prof. Robert 
Kopp (Rutgers U.), Rev. Earl Koteen (Environmental Justice Minister), Adam Litke 
(Risk Products, Bloomberg L.P.), Dr. Bob Litterman (Kepos Capital), Prof. Scott Mandia 
(Suffolk County Comm. College), Miranda Massie (Climate Museum Launch Project), 
Scott Nystrom (Regional Economic Models, Inc.), Curtis Ravenel (Sustainability 
Initiatives, Bloomberg L.P.), Brian Reynolds (Climate/Money/Policy), Dr. Danny Richter 
(Citizens’ Climate Lobby), Joseph Robertson (Citizens’ Climate Lobby), Prof. Alan
Robock (Rutgers U.), Gabriel Thoumi, CFA (Calvert Investment Management, Inc.), 
Jeremy Tomasulo (Former Press Secretary, US H.R.), Brett Whysel (BMO Capital 
Markets). I also thank innumerable people for informative discussions on climate change. 


Responsibility for errors is mine. 


? Notes: 1. Details will evolve, but the climate issues presented here will exist for a long 
time. 2. I often just write “climate” for “climate change". 3. This chapter can mostly be 
read independently of the others. 
For these reasons, I think it is critical to adopt climate risk management³.

I propose a formal structure for climate risk management, and also emphasize 
positive opportunities occasioned by mitigating climate change. I propose two 
climate risk metrics: “Climate Change Value at Risk” and a “Climate Change 
Reward-to-Risk Ratio”. I discuss an ethically based negative discount rate for 
valuation of future climate impacts, if we do not act responsibly on climate now. 
While focusing on business and finance, I also give a background overview. The 
material is of necessity condensed because climate is complex. The outline is: 


Climate Change Risk Management — Formal Structure 

Finance, Economics, and Climate Change Risks 

Business, Investors, and Climate Change Risks 

Economic Models Related to Climate Change Impacts 

Chapter Wrapup and Epilogue 

Background Overview of Climate Change (Appendices I-IV)⁴:


I. The Physical Science Basis
II. Impacts and Vulnerabilities
III. Risk Management Mitigation and Adaptation
IV. Risk from Climate Contrarian Obstruction


Three Levels of Approach to this Chapter 


Level one is the text, a running summary. Level two is attached detail in the
footnotes (e.g.⁵). Level three is reference in-depth material⁶. References are the
authoritative IPCC reports⁷ from primary climate science sources⁸ ⁹, climate risk
sources, and many others, with Internet links when available (see
especially RealClimate.org and SkepticalScience.com).


My personal history related to climate is here¹¹.


³ Climate Change Risk Management: We hope for the best; risk management advises
preparation for bad possibilities. Climate risk management is about mitigation (action of
amelioration or abatement) and adaptation (adjustment) for managing the risks of climate
change, discussed in this chapter. It is applicable locally, nationally, and globally.

Climate change risk management is increasingly becoming the framework of choice,
enabling rational discussion.


⁴ Background Overview: The appendices summarize the complex multifaceted topic of
climate change, with references. This overview follows my basic talk on climate change.


⁵ Climate Change, Weather, Global Warming: Climate change is a general concept. It
involves longer time scales (e.g. 30 years) than weather, which has substantial short time
scale fluctuations (noise). Global warming is the general upward temperature trend of
climate change observed since around 1975 (although the details are complicated,
involving fluctuations of extra heat going into and coming out of the oceans; see App. I).


⁶ Sources, References, Links: Besides the IPCC reports and other primary sources, I
used secondary sources with information from primary sources. Some sources for climate
impacts, mitigation, and adaptation are more informal than climate science.

References are extensive and cover each topic, with a plethora of details. They are
useful resources. There is however no pretense at completeness. Some references are
more easily understood than others, some are specialized, and some have general
information. Some references contain quantitative information for climate data, energy
data, climate scenarios, and climate impacts. References may be paired with footnotes.
Many references are after the IPCC AR5 cutoff dates. The reference cutoff date here was
2/1/2015. Internet links are current as of this writing, but may change.


⁷ The Intergovernmental Panel on Climate Change (IPCC) Reports: The IPCC
periodically produces three voluminous reports of assessments and syntheses of
peer-reviewed academic and laboratory research, plus “selected non-peer-reviewed literature”,
with references. The IPCC does not perform research. The IPCC process is extensive and
transparent with heavy peer review. Assessments are conducted by recognized experts
who volunteer. The 2013-14 AR5 reports cover the literature up to 2012-13. The previous
AR4 reports were in 2007. The “Summary for Policymakers” (SPM), as a unique case,
requires approval by all countries, and so is conservative. I recommend the three IPCC
Technical Summaries for a professional introduction. They are free to download.


⁸ What credentials qualify a “Primary Climate Science Source”? For this book,
credentials for a primary climate science expert are:

(1) Holding a science PhD and a staff position related to climate at a university or 
laboratory, and 

(2) Having research published recently in respected peer-reviewed climate journals. 
These are strict criteria, generally applicable to the IPCC science report. 


⁹ Secondary Climate Science Expertise: I define secondary climate scientific expertise
as being able to understand and accurately communicate at least some primary climate 
science source material. Secondary climate science expertise requires technical ability 
PLUS considerable study, and (in my experience) is not easy to achieve. Secondary 
climate expertise is not generally sufficient to critique primary source climate material. 


10 Risk Expertise: Risk expertise is critical to understand climate risk management. 
There is “primary” risk expertise (e.g. working as a quant or risk manager in the industry) 
and “secondary” risk expertise (e.g. learning through individual study).


¹¹ Personal History (Science, Climate, Risk): My science credentials include post-PhD
academic/research physics positions for over 15 years, with over 60 papers published in 
scientific journals. I have studied climate since 2005, attended the 2009 Copenhagen 
Climate Conference, am a Climate Science Rapid Response Team Matchmaker 
connecting scientists to journalists, and the Managing Editor of the UU-UNO Climate 
Portal website (Unitarian Universalist United Nations Office). I have been working in 
quantitative risk management for longer than I care to remember. 


Stories: There are a few personal stories in footnotes, as in the rest of the book. 
Climate Change Risk Management - Formal Structure


Climate risk management in principle should follow the general logic of risk
management. Here is a formal “recipe” structure for climate risk management¹².


1. Identify a future hazardous situation S or event, or a class of such
   situations/events, due to climate change
2. Estimate the exposure(s) E, assuming the situation S occurs
3. Estimate the vulnerability V to the exposure
4. Estimate the impact I of exposure E, given the vulnerability V
5. Calculate the risk R of the impact I
6. Decide on “acceptable levels” of risk R due to impacts (hazards)
7. If possible, estimate the probability P of S occurring
8. Estimate the cost/effort C to reduce the risk R to an acceptable level.


I define the risk R of a hazardous situation S in theory as the product of
(the probability P that the situation occurs) times (the impact I). The impact I
corresponds to the exposure E for vulnerability V, and gives the loss if the
situation occurs. If it is not possible to estimate the probability P, the situation is
called a “scenario”, and the risk R is then taken as the impact I of exposure
E to the scenario (essentially the conditional risk assuming the situation
occurs)¹³. Resiliency, hazard, and severity enter also¹⁴.
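The definition of risk above can be written as a two-branch rule. The sketch below is only an illustration of that rule; the function name, the units, and the numbers are hypothetical, and in practice the impact and probability are themselves uncertain estimates.

```python
def climate_risk(impact, probability=None):
    """Risk R = P * I; with no probability estimate (a "scenario"), R = I."""
    if probability is None:
        return impact                  # scenario: conditional risk, given that S occurs
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return probability * impact

if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(climate_risk(impact=80.0, probability=0.25))   # risk with an estimated P
    print(climate_risk(impact=80.0))                     # scenario risk
```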

The cost C of risk reduction depends on the mitigation action taken to
counter the risk, and/or the cost of adaptation to the risk.

As with finance, there is risk that can be extracted from history and risk that 
can be projected using models into the future. 


¹² Climate Risk Recipe: The IPCC WG2 report discusses impacts, vulnerabilities, and
exposures (ref). The structure above puts climate risk management in a recipe format.
Usually items are discussed individually. Putting everything together in a framework is
ambitious. At present, due to complexity, it is possible to achieve this climate risk
management framework only incompletely and qualitatively.


¹³ Comment on Exposure: Note that exposures E can depend on the vulnerabilities V;
higher V can imply higher E. Poor people with high vulnerability may live in places with
high exposure, such as living near a seacoast. Business exposure is complicated, for
example involving supply chains dependent on facilities vulnerable to climate impact.


" Resiliency, Hazard, Severity: 


(1) Resiliency is a goal for actions to decrease vulnerability to climate change. 
Resiliency results from preventative adaptation, and is increased with mitigation. 


(2) A hazard is a potential negative impact. 
(3) Severity measures the degree of impact. High severity means a bad impact. 
Risk management in finance is concerned with hedging or ameliorating
adverse potential effects, consistent with the “risk appetite”. Risk management
for climate has a similar philosophy.


Why is climate change a problem? Haven't we always had disasters? 


There have been disasters since the Dawn of Time. However climate change can 
increase the probability and severity of disasters. 

As discussed in Appendix II, climate change impacts are mostly negative, are
being observed now, and will become far worse under a BAU (Business as Usual)
scenario, in which we do not do much on climate mitigation and substantial
risk-management mitigation is absent¹⁵. Given a BAU scenario, climate
change will increase the probability and severity of impacts negatively affecting
businesses, the economy, and society. So although the present generation is
already feeling some climate impacts, these impacts are only a faint rumbling of
the climate impacts that will hit our descendants if BAU prevails.

Under BAU, in the future there will be no place to hide. 


Estimates, Ranges, Uncertainties, and Risk Management 


The word “estimate” means that uncertainty is present and ranges of possible 
values exist. All quantities (S, E, V, I, R, P, C) are complicated, depend on
multiple underlying variables and expert judgment, and have uncertainty ranges
that may or may not be possible to ascertain quantitatively.

Uncertainty and risk management go together. Risk management in a 
concrete sense is action to deal with uncertainty as best we can, including climate 
change risk. Uncertainty does NOT mean “no risk". 

What we do about various risks depends on the risk management goals, the 
"acceptable levels" of risk (risk appetite), and the costs to accomplish risk 
management. Everything has some uncertainty, and actions are taken around the 
world every day without certainty. BAU, which avoids action on climate change, 
ignores principles of risk management used in fields from aviation to finance. 


Trends of Risk for Climate Change 


It would be erroneous to think climate risk is only about uncertainty. Much 
climate risk is, for all practical purposes, quite definite — e.g. under BAU, future 


P What is “BAU”? Are we doing better than BAU now? Business As Usual (BAU)
indicates that not enough action exists to lower greenhouse gas emissions to mitigate
climate change in an appreciable sense. There are many positive efforts now, so current
mitigation is certainly above BAU, and the number of positive mitigation efforts is
definitely increasing. However, today's mitigation is not at all sufficient to achieve the
Copenhagen Conference goal of limiting the temperature increase to 2 degrees C by
2100. So I simply assume BAU as currently operational. See App. III.
climate risk on average and in the tail will be greater than now. It is quite definite
that global warming exists and that humans are the main cause (App. I), quite
definite that global warming will produce increasingly harmful impacts (App. II), 
and quite definite that we can mitigate climate impacts (App. III). 

These statements define trends of risk for climate change. Uncertainty exists 
around the trends. “Global warming exists” is a trend, with uncertainty of the 
amount of future global temperature rise. Climate change impact risk has a trend 
- impacts will be increasingly bad — and an uncertainty of “how bad” around the 
trend. Climate change mitigation has a trend of positive action — and an 
uncertainty of how much mitigation can be accomplished. 


Climate Change “Tail Risk" 


In assessing risk, it is essential to consider hazardous "tail" situations - “unusual” 
events, because that is where the really bad risks R lie with high impact'*. 

Prof. K. Emanuel (MIT) posted (endnote xxvi) an instructive estimate of the
probability P distribution of global mean temperature increase resulting from a
doubling of CO2, made from 100,000 simulations using a climate model run with
combinations of parameters varied across plausible ranges.

The model-produced temperature increase has uncertainty, quantified by the 
graph. The tail of high risk is in the region of high temperature increases. The 
main non-tail risk is in the central region with little temperature change. This is 
an excellent example of responsible reporting of the probabilities of events with 
different risks. We cannot tell what the risks are until we estimate the exposure 
E for a given temperature increase and other attributes of climate change. 

The real tail risk is that climate change in a BAU scenario will move the 
planet out of equilibrium, into new uncharted territory that the human race has 
never experienced. Tail risk includes “abrupt climate change” that can have 
devastating consequences". 

An excellent animation" from NASA shows how the distribution of Northern 
Hemisphere summer temperature anomalies has shifted toward an increase in hot 
summers. 

Here is an illustrative figure of present and future risk from climate change, 
including the change in tail risk (cf. the 2001 IPCC figure for extreme weather ^): 


^ Stressed VAR analogy for tail risk management: In earlier chapters I discussed 
Stressed VAR for risk management of financial tail events (market crashes). We need to 
address tail risk for climate. Ignoring climate tail risk is like ignoring market crashes. 


17 Doubling CO2: This is a common scenario. The benchmark is the pre-industrial level.
Sometimes other greenhouse gases are included and converted into CO2 “equivalents”.
Other scenarios are also used (below).
Bigger Climate Risk in the Future and Climate Risk Management


[Figure: RISK INCREASES IN FUTURE FROM CLIMATE CHANGE. Two curves: Risk Profile Today and Risk Profile in Future. Axes: RISK and PROBABILITY OF RISK. Annotations: “Low risk today with high probability”; “Bigger average future risk.”]


FIGURE CAPTION: Illustrative future climate risk profile (dotted curve on right with 
high average risk) vs. Today’s risk profile (solid curve on left with lower average risk). 
“Today” means: “roughly current as of 2014". “Future” means in 50-100 years. “Profile” 
means the probability distribution function. Future average risk is greater than today’s 
average risk; this is the trend of the risk. With high probability, all of today’s risks are low 
relative to future risks. Future risk uncertainty is bigger than today’s risk uncertainty, with 
huge future tail risk (dotted curve’s right side). Today’s tail risk (right side of solid curve, 
e.g. Superstorm Sandy) is quite small compared to future tail risk. 

Similar graphs apply for climate impacts, vulnerabilities, and exposures. See App. II. 

Climate risk management is concerned with the definite average increase in risk, with 
the uncertainty in risk, and with very big future tail risk - all due to climate change. 


The same graph applies for positive mitigation opportunities, now and future. 


Climate Change Non-Tail Risk


"Non-tail events" generically are the “common” events. They generally have less
impact and are easier to manage, although with high vulnerability V a non-tail 
event can have high risk. Negative climate impacts can occur with non-tail risk;
"grinding away" can also have deadly impacts. An example is the steady increase 
of food prices from increased drought, changed precipitation, increased insects 
moving north, and increased invasive plants that negatively impact food supply. 


The Cost of Climate Change Risk Management 
The cost C of climate change risk management includes the cost of climate-change
mitigation C_Mitigation. A portfolio of mitigation options includes action by
individuals, corporations, Non-Governmental Organizations (NGOs), and
governments at all levels (local, state, national, and international). There will
also be a cost C_Adaptation for adaptation, including reducing vulnerability V. The
costs C_Mitigation and C_Adaptation depend on the history of previous efforts. Thus the
cost that will need to be paid by our descendants if we do not act is likely to be 
far higher than that we would need to pay now to alleviate climate change. The 
effectiveness of mitigation to alleviate climate change depends a-priori on the 
amount of human-caused climate effects into the future. 

One main argument in this chapter is that positive opportunities encountered 
in mitigating climate change may be larger than the costs. 


Attitudes on Climate Change (U.S.) 


The attitudes of people in the U.S. toward climate change are measured in 
comprehensive surveys by Yale University *. Six categories are identified". 


A Simple Analogy for Climate Risk and Climate Insurance 


To help understand the above concepts, consider the mundane example of the risk
of being in a car accident. The quantities (S, E, V, I, R, P, C) can be: S (bad
event) = Accident, E (exposure) = Being on the road, V (vulnerability) = Higher
if it’s rush hour in the rain, I (impact) = Damage if an accident occurs, P
(probability of accident) = Statistical estimate using history, R (risk) = Cost of
damage. Depending on risk tolerance, we can envision different actions: C (cost
of risk action) = driver's education (mitigation) or car insurance (adaptation).
Note we buy insurance not because we want to use it but because we may 
have to use it. Insurance costs us something, but not having insurance can cost us 
much more. There are two types of auto insurance: (1) Accident insurance (bad 
tail risk) and (2) Comprehensive insurance (ordinary risk). Accident insurance is 
mandatory because a bad accident is where the really bad risk exists. Even 


16 The “Six Americas” attitudes on climate (2012): “Alarmed” (16%), “Concerned”
(29%), “Cautious” (25%), “Disengaged” (9%), “Doubtful” (13%), “Dismissive” (8%).
though the probability P of a bad accident for a driver may be small, the impact
loss/damage for a bad accident is high, producing high risk R”. 

The risk reduction cost C is the total of mitigation and adaptation costs. The 
example shows that a portfolio of different types of risk reduction is useful. 

Of course, uncertainties in this mundane example abound. That does not stop 
people from prudent risk management, and the law sensibly requires it. 
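
The car example can be put in rough numbers. The sketch below is only illustrative: it assumes that risk R can be summarized as probability P times impact I (a common expected-loss convention, not a definition taken from this chapter), and every probability and dollar figure in it is hypothetical.

```python
# Illustrative only: all probabilities and dollar amounts are hypothetical.

def expected_loss(probability: float, impact: float) -> float:
    """Risk R taken as probability P times impact I (an assumed convention)."""
    return probability * impact

# Ordinary (non-tail) event: a minor accident, fairly likely, low impact.
ordinary = expected_loss(probability=0.05, impact=2_000.0)

# Tail event: a bad accident, unlikely, very high impact.
tail = expected_loss(probability=0.002, impact=500_000.0)

print(f"ordinary annual risk: ${ordinary:,.0f}")
print(f"tail annual risk:     ${tail:,.0f}")
```

With these made-up inputs the rare bad accident carries more risk than the common minor one, which is the point of the analogy: a small P does not imply a small R when the impact is large.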


Climate Insurance vs. Standard Insurance 


The concept of “climate insurance” is somewhat complicated, including climate
mitigation action, but the basic concept is similar to car insurance.

Insurance (reinsurance) companies in the business of extreme-weather
catastrophe insurance have a hard, data-driven view of climate risk. Extreme
weather insurance is a standard type of insurance, essentially like car insurance.

"Climate insurance" is more complicated than standard insurance or simple 
extreme weather insurance. The concept relies on the fact that there are two
classes of people: those living, and their descendants. The climate mitigation
costs are paid by those living, to benefit descendants by reducing future
climate change impacts I.

If annualized, mitigation costs could be considered a "climate insurance 
annual premium". There is no "insurance company" to pay out in case of an 
“accident”. The “accidents” are, in this imperfect analogy”, future climate 
impacts. The “payout” is measured in terms of avoided future negative climate 
impacts. The “risk transfer" is from descendants (fewer climate impacts) to those 
living (expense of mitigation). Those living pay their descendants indirectly for 
them to deal with future climate change impacts by lowering these impacts". 


? Insurance Premium: Although the probability of a bad accident theoretically depends 
linearly on the annual mileage, only a small premium discount is given for low mileage. 


? Insurance - Good but Imperfect Analogy: Insurance is an imperfect analogy to 
promote understanding. Insurance pricing is based on average claims plus a spread over a
diversified set of risks and policy holders. Climate change is not a diversifiable risk
because the real risk occurs in the risk tail which will produce widespread impacts. This
is like a portfolio with a concentrated lumpy risk. That is, car insurance is for expected 
losses, whereas climate risk requires a risk premium. There is however diversification 
from a time standpoint considering both those alive now and descendants. 

Pindyck has written “one can think of a GHG abatement policy as a form of 
insurance: society is paying for a guarantee that the low-probability climate disaster (or 
its economic impact) will not occur...” (cf. ref.) 

One limitation is that there is no positive opportunity for mitigation in the analogy. 
Here is an illustration of indirect payout for climate risk transfer (cf. above):


[Diagram: {those living} -> {descendants}]


Climate Change Action, Ethics, and Morality 


Part of the future adaptation cost C_Adaptation cannot easily be measured in monetary
terms. A cost, especially under BAU, will be in human suffering. There are
significant ethical and moral issues. Here is one statement:


21 Climate Change Ethics and Morality: Three Issues: Climate change is the greatest
ethical and moral challenge of our time. Below are the three main issues, all interlinked 
with climate. Long-term solutions will require mitigating climate change substantially. 

(1) Intergenerational Equity and Sustainable Development: The quality of life of 
our descendants will depend on what we do or don’t do for climate mitigation. 
Sustainable development is defined as “development that meets the needs of the present 
without compromising the ability of future generations to meet their own needs” (ref). 

(2) Environmental Equity, Environmental Justice, Poverty, Pollution: The poor, 
who do the least to cause the problem because they produce little greenhouse gas 
emission, will suffer the worst because they have few resources for adaptation and are the
most vulnerable. Women and children in poor countries are especially vulnerable, as are 
small farmers. The eight Millennium Development Goals to eradicate poverty (and to 
which all UN member states are committed), and the follow-on Post-2015 Development 
Agenda, are compromised by climate. 

Pollution issues include environmental and health impacts from coal mining, coal- 
fired power plants, and many other pollution sources. Pollution is connected with 
environmental justice and poverty, because pollution tends to be worse in poor areas. 


(3) Energy Parity Equity and Population: Energy parity equity means the goal of 
equal world-wide average per-capita energy resources. The climate problem is that 
developed countries have already largely used up the CO2 budget available for a livable
planet, so future economic development of underdeveloped countries will be 
compromised under a climate-mitigating carbon budget limiting fossil fuels, unless non- 
carbon energy sources are developed. If so, underdeveloped countries would “leapfrog” 
over fossil fuels into the new renewable world. 

Population in underdeveloped countries is much larger than in developed countries, 
and this population is rapidly increasing. Increasing population exacerbates the energy 
parity issue - more people need more energy. However at present, per-capita use of 
energy is much lower in underdeveloped countries than in developed countries. 


? Intergenerational Equity, Our Legacy: What will Future History Books say? Lest 
intergenerational equity seems theoretical, I am talking about climate risk management 
for the well-being of our grandchildren and future descendants. If we do all we can 
(consistent with our own reasonable well-being), they will thank us. If we do little about 
climate change and they are hit hard, what do you think they will say? What will our 
legacy be? What will future history books say about us? Our choice. 
“Humane effective responses to global warming with an ethical and moral
foundation require difficult equitable resolutions of conflicting national 
situations generated by different per-capita emissions (historical, current, and 
future), economic development, and energy requirements. Nevertheless the 
interests of all are intertwined... Let not future generations, impacted by global 
warming, say of us, ‘They knew but did not act’.” (Refs. ***>) 


Finance, Economics, Risks, and Opportunities 


Stranded Assets and the Carbon Bubble Risk 


In order to prevent climate change from causing unacceptably severe and 
unmanageable impacts, the 2009 Copenhagen Accord set a limit of 2 degrees 
Centigrade (2°C) for global warming. A profound consequence follows if
we stay within this limit - namely a substantial amount of carbon that could be
extracted and burned cannot in fact be extracted and burned. Carbon left in the 
ground is a useless asset that is stranded (in the ground) and therefore worthless. 


? Economic Utility Theory: In principle economic utility theory can be used to quantify 
future suffering due to climate impacts. In practice, I am not so sure how realistic this is. 


?' The statement in italics is from “Climate Change: Summary and Recommendations to 
Governments", by the Non Governmental Organization Committee on Sustainable 
Development (NGOCSD). The paper was distributed to negotiators and officials at 
Copenhagen in 2009, and at subsequent UNFCCC Conferences. Many NGOs contributed 
to the paper. It is updated each year. I am the editor. The NGOCSD is in the Conference 
of NGOs in Consultative Relationship with the United Nations (CoNGO; ref). 


Many fine moral/ethical statements motivating climate action exist (refs). 


2 Story: At the Copenhagen Conference I got into a session including the Prime Minister 
of Denmark, to whom I wanted to give a copy of the NGOCSD climate paper. After the 
panel, I got to the stage just as his legion of bodyguards was closing in, and managed to 
give him the paper. He put it in his suit pocket and said he would read it. 


20 The 2°C target of the Copenhagen Accord: This target is an increase in average 
surface temperatures above pre-industrial levels consistent with an estimated reasonably 
low probability of unacceptable climate impacts. Note that two degrees Centigrade rise 
equals 3.6 degrees Fahrenheit rise. This target should be supplemented with other scientific
indicators, notably the total energy being retained by the planet (including the oceans). 


27 How Much Stranded Carbon Exists? The amount of stranded carbon depends on the 
target temperature rise by 2100. Given a target — e.g. 2 degrees Centigrade — the amount 
of fossil fuel consumption can be estimated. 50% of stranded carbon is a useful
benchmark; however much higher numbers for stranded carbon have been advanced. 
Theoretically, such stranded carbon assets should negatively affect stock prices
of fossil fuel carbon-based corporations and others correlated to them. 

Since stranded assets are not priced into current stock valuations, stocks 
affected by stranded carbon assets are “artificially” high (the “Carbon Bubble”)*. 
The carbon bubble constitutes a significant risk to finance and the economy". 

There are two logical possibilities. Either (1) substantial action will be taken 
to mitigate climate risk by forcing carbon assets to be stranded, or (2) no 
substantial action will be taken. In case (1), eventually (though with unknown 
time scales) the carbon bubble will burst, and carbon-dependent stock prices will 
fall. In case (2), huge climate impacts on financial and economic systems may 
occur, and all stock prices could eventually fall". Indeed in this latter (BAU) 
case, perhaps we won’t care because we may wind up so desperately occupied 
with basic survival issues. 


There are also other complications.


Changing Oil Industry Position on Climate Risk; Reputation/Legal Risks 


The oil industry position on climate risk is in flux. Historically, the oil industry
(notably ExxonMobil) promoted contrarian disinformation (App. IV). Their
current position acknowledges climate risk, possibly (I conjecture) because of
potential legal and/or reputation risks. However, this is accompanied by


25 How Much Will Fossil Fuel Stocks Fall if the Carbon Bubble Pops, and When? 
Given the fraction of stranded assets, the change in the stock price of a corporation in the 
fossil fuel sector can be calculated. Bloomberg LP has a Carbon Risk Valuation Tool. For 
example if the ratio of stranded assets to total assets is 50%, the stock price is 
approximately overvalued by a factor of 2, because half the assets are worthless. 

There is no statement about the time scales over which the stock overvaluation 
correction may occur. However the coal sector has recently devalued, and oil prices have 
been declining. Also, there may be a “collective crowd" effect to “get out of the stock 
early before the bubble pops", compressing time scales. 
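
The factor-of-2 arithmetic in this footnote generalizes to other stranded fractions. A minimal sketch, assuming (as the footnote does) that the stock price is proportional to total asset value and that stranded assets are worth nothing; the function name is hypothetical and this is not the Bloomberg tool.

```python
def overvaluation_factor(stranded_fraction: float) -> float:
    """Ratio of the current price to the price with stranded assets written off.

    Assumes price is proportional to asset value and stranded assets are worthless.
    """
    if not 0.0 <= stranded_fraction < 1.0:
        raise ValueError("stranded_fraction must be in [0, 1)")
    return 1.0 / (1.0 - stranded_fraction)


# Hypothetical stranded fractions, for illustration only.
for fraction in (0.25, 0.50, 0.75):
    factor = overvaluation_factor(fraction)
    print(f"stranded {fraction:.0%}: overvalued by a factor of {factor:.2f}")
```

The factor grows quickly as the stranded fraction rises, which is why the choice of benchmark fraction matters.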


? Falling oil prices, Carbon tax: Commodity dynamics are complicated. If oil prices 
fall instead of rising, climate mitigation is compromised because oil demand increases. A 
carbon tax/fee would raise oil prices. At the nominal equivalence (in USD) of $1/tCO2 =
1¢/gal of gasoline at the pump, restoring a $1/gal drop in gas price is equivalent to a
carbon tax on oil of $100/tCO2. Cheap oil increases pressure on shale oil, which is
expensive to extract. 
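
The conversion in this footnote is simple enough to sketch. The code below uses only the footnote's nominal equivalence of $1/tCO2 = 1 cent per gallon; the function names are hypothetical, and the real pass-through to pump prices depends on market details that are ignored here.

```python
# Nominal equivalence taken from the text: $1/tCO2 corresponds to 1 cent/gal.
CENTS_PER_GALLON_PER_DOLLAR_PER_TCO2 = 1.0


def pump_price_change(carbon_tax_per_tco2: float) -> float:
    """Change in gasoline price ($/gal) implied by a carbon tax in $/tCO2."""
    return carbon_tax_per_tco2 * CENTS_PER_GALLON_PER_DOLLAR_PER_TCO2 / 100.0


def carbon_tax_to_offset(price_drop_per_gallon: float) -> float:
    """Carbon tax ($/tCO2) that restores a given drop in gasoline price ($/gal)."""
    return price_drop_per_gallon * 100.0 / CENTS_PER_GALLON_PER_DOLLAR_PER_TCO2


print(carbon_tax_to_offset(1.00))  # tax needed to offset a $1/gal price drop
print(pump_price_change(100.0))    # price change from a $100/tCO2 tax
```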


? Possible negative correlation between fossil fuel commodity prices vs. stocks and 
extra induced market instability: An unusual negative correlation could exist due to the 
externality of stranded assets and climate mitigation policies. Falling fossil fuel stock 
prices due to the carbon bubble popping, coupled with future rising fossil fuel commodity 
prices from climate mitigation, would have negative correlation. Usually a positive 
correlation exists between stock price changes and asset price changes. 

Future correlation values could be unusually volatile, swinging between positive and 
negative. I think this correlation volatility could induce extra market instability. 


31 Legal Risks: There are increasing legal risks and litigation from climate change. See refs.
denial that stranded assets exist, claiming that the entire fossil fuel reserves will
be needed to satisfy future energy demands, without acknowledging the 
connection with unacceptable climate impacts and assuming no significant action 
will be taken to prevent burning these reserves; cf. ^^. 


The Social Cost of Carbon (SCC) 


The social cost of carbon or SCC is a measure of the cost to society for the
negative impacts of carbon. SCC is the marginal cost due to climate impacts per
ton of CO2. Quantifying SCC is done with scenarios of impact estimation. SCC
estimates vary widely, depending on the assumed discount rate, whether or not 
climate impacts are taken to influence productivity, etc. 

The Daniel-Litterman-Wagner model (ref) finds that the price of carbon 
emissions is high now and should decrease over time as impacts of climate 
become more certain. This result contrasts with common strategies where the 
price of carbon emissions increases in time. 


Proposal for “Climate Change Value at Risk", $Climate VAR 


Future climate impact assumptions vary widely and are expressed as scenarios 
(App I). Some aspects of Value At Risk (VAR) (Ch. 26-30) can be used, though 
with notable technical differences. The similarity is that probabilities could be 
assigned to impacts, and the results expressed as loss at a given confidence 
level. Risk measured at a confidence level provides a similarity to VAR.
The result would be a $Climate VAR, which could serve as a benchmark
for the risk of impacts on business and the economy due to climate change and
global warming, at a given confidence level. 


22 Units: The Social Cost of Carbon is measured in $US/tCO2 per ton of CO2 emitted to
the atmosphere or in $US/tC per ton of carbon C. A ton of carbon is equivalent to around
4 tons of CO2. An example is $60/tC = ($60/tC)*(tC/4tCO2) = $15/tCO2 approximately.
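
The unit conversion can be checked in a few lines. This sketch is an illustration rather than part of the original calculation: it compares the rounded factor of 4 used above with the molar-mass ratio 44/12 (about 3.67 tons of CO2 per ton of carbon). The function name is hypothetical.

```python
# Tons of CO2 per ton of carbon, from approximate molar masses (CO2 = 44, C = 12).
CO2_PER_C = 44.0 / 12.0


def per_tc_to_per_tco2(price_per_tc: float, ratio: float = CO2_PER_C) -> float:
    """Convert a price in $/tC into the equivalent price in $/tCO2."""
    return price_per_tc / ratio


rounded = per_tc_to_per_tco2(60.0, ratio=4.0)  # the text's rounded factor
finer = per_tc_to_per_tco2(60.0)               # using 44/12

print(f"$60/tC with factor 4:     ${rounded:.2f}/tCO2")
print(f"$60/tC with factor 44/12: ${finer:.2f}/tCO2")
```

The two results differ by under ten percent, which is small next to the spread among SCC estimates themselves.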


3 Uncertainties or Ranges of Estimates in Climate Change Impacts: Most of the 
uncertainties or ranges of estimates of climate impacts, as with temperature increase, are 
dependent on various scenarios of human behavior for mitigating, or not mitigating, climate change.
There are also uncertainties in the climate model estimates for a given human behavior 
scenario. These uncertainties are carefully shown in the IPCC reports. 


% Independent VAR Work: Gabriel Thoumi tells me that he has independently written 
almost the same Climate VAR model (G. Thoumi, CFA, Calvert Investments, Inc; private 
communication). There are Value at Risk reports in environmental contexts, including 
natural capital at risk (refs). 
$Climate VAR is a forward-looking risk measure since climate impacts are
increasing with time. Technically $Climate VAR would be designed to 
measure risk over many years. This aspect is very different from VAR. 

Modeling here would involve a long-term simulation with climate impacts, in 
a sense similar to long-term real-world counterparty risk simulations with “credit 
defaults” replaced by climate impacts. This simulation would not use normal 
(Gaussian) distribution assumptions. 
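
A toy version of such a long-term simulation is sketched below. It is a sketch under stated assumptions, not a calibrated model: impact events arrive with an annual probability that grows over time, each event's loss is drawn from a lognormal (heavy-tailed, non-Gaussian) distribution, and the "$Climate VAR" is read off as the loss at a chosen confidence level. All parameter values, and the function names, are hypothetical.

```python
import random


def simulate_cumulative_loss(rng, years, start_probability, probability_growth,
                             severity_mu, severity_sigma):
    """One simulated path: total loss from impact events over the horizon."""
    total = 0.0
    for year in range(years):
        # Assumed: the chance of an impact event rises each year, capped at 1.
        probability = min(1.0, start_probability * (1.0 + probability_growth) ** year)
        if rng.random() < probability:
            # Assumed: heavy-tailed (lognormal) loss severity, not Gaussian.
            total += rng.lognormvariate(severity_mu, severity_sigma)
    return total


def loss_at_confidence(losses, confidence):
    """Empirical quantile of the simulated losses at the given confidence level."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, int(confidence * len(ordered)))
    return ordered[index]


rng = random.Random(2014)  # fixed seed so the run is repeatable
losses = [
    simulate_cumulative_loss(rng, years=50, start_probability=0.10,
                             probability_growth=0.02,
                             severity_mu=0.0, severity_sigma=1.5)
    for _ in range(10_000)
]

mean_loss = sum(losses) / len(losses)
var_95 = loss_at_confidence(losses, 0.95)
var_99 = loss_at_confidence(losses, 0.99)
print(f"mean loss {mean_loss:.1f}, 95% level {var_95:.1f}, 99% level {var_99:.1f}")
```

The gap between the mean and the high-confidence levels is the tail risk that a Gaussian assumption would tend to understate.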

A fundamental problem is that market value is often irrelevant. Market value 
needs a market, which does not exist for many illiquid assets that would be 
impacted by climate. Other estimates of value would be required. 

As with other modeling efforts, use of changes of economic indicators like 
the GDP would be necessary for simplification of risk”. 


Other Risk Measures for Climate 


Other risk measures for climate could be Earnings at Risk and Cash Flow at Risk. 
These are complementary to the $Climate VAR suggested above. 


Climate Change Value Adjustment (CC_VA) 


We discussed CVA and FVA in Ch. 31. A climate change value adjustment 
CC_VA for a portfolio of assets could be defined. The backward induction 
process could be applied with a terminal constraint (e.g. two degree C 
temperature increase for 2100). 


Positive Opportunities in Mitigating Climate Change 


Moving to new technology paradigms has the potential for positive opportunities. 
If our Earth is to remain livable, the world economy now dependent on the
multi-trillion dollar fossil fuel energy sector must deliberately transition to using 
renewable energies, producing massive job creation for infrastructure building. 

USCAP, a group of businesses and environmental organizations that calls
for emission reductions, says: “In our view, the climate change challenge will 
create more economic opportunities than risks for the U.S. economy." 

From the real-money investor group CERES: “Tackling climate change
is one of America’s greatest economic opportunities of the 21st century”.


35 Acknowledgements: I thank Mark Fulton for a very informative discussion. 


36 Our Earth: We will come across this later in a story about the character B. Earth.
The World Bank says: “it is now clear that climate-smart development can
boost employment and can save millions of lives... it is clear that the objectives of 
economic development and climate protection can be complementary. ” 

Deutsche Bank says: “...Investing in climate change can help stimulate
economies”.


Bloomberg L.P. Sustainability says: “...sustainable practices can be both
good for business and good for our planet. " 


Positive Opportunity Rewards, Climate Change Reward-to-Risk Ratio 


There are positive opportunity ($PO) potential upside rewards in mitigating
climate change. A “Climate Reward-to-Risk Ratio” CRR could be
adopted to measure the efficacy of climate mitigation relative to climate risk. I
take the numerator as the reward from climate mitigation (positive opportunity
$PO), and the denominator as the risk ($Climate VAR). So CRR is


37 Analogy of Positive Opportunities for Energy Paradigm Shift with the Computer
Paradigm Shift: In grad school I lugged data tapes and boxes of 80 column cards up the 
stairs to the mainframe to submit jobs for Monte Carlo simulations of multi-dimensional 
phase-space integrals. I read octal dumps for debugging. These days are long gone, 
although some mainframe guys did not go quietly (see footnote 1, Ch. 35). 

When the paradigm changed from centralized to distributed computing and the 
Internet, massive positive opportunities occurred and incomparably many more jobs were 
created than lost. There are new technology billionaires. 

Similar opportunities will occur with an energy paradigm shift. Today's fossil-fuel 
industry is like yesterday's mainframe industry. Yes we will need some fossil fuel for 
some time to come but we need to move deliberately away from fossil fuels to maintain a 
livable planet. Yes we still need mainframes for some computing but most computing is 
now not mainframe. The analogy I think is clear. 


38 Cost of Missed Opportunity, Negative Mitigation Costs: Missed opportunity by not 
mitigating — e.g. not investing in renewable energies - should be listed as a negative $PO 
cost. Downside mitigation costs are subtracted. $PO is a net number. 


? Concrete Example of Positive Opportunity: A good example is the REMI model
calculation for a carbon fee with 100% dividend, described elsewhere in this chapter.
More jobs are created by the demand that the dividend generates than are lost in the
fossil fuel sector. For this example, $PO is indeed positive.

With large scale climate mitigation, there will be winners and losers. There will be 
more winners and fewer losers over the long term from acting on climate. 


“ Adaptation and Resilience: The positive opportunities that will occur by increasing 
adaptation measures and increasing resilience should also be included. 


^' Actually calculating Climate Change VAR and Climate Change Reward-to-Risk 
Ratio: I have no illusions of the difficulties involved, but I also feel it's good to have a 
framework. Actual calculations would involve complex multivariable scenario analysis 
with huge data requirements. My purpose here is just to introduce the ideas. 
CRR = $PO / $Climate VAR    (53.1)


We measure CRR without and also with climate mitigation. Without mitigation,
CRR = 0 because then by definition $PO = 0. With mitigation, CRR increases
with $PO > 0 and also increases with reduced risk $Climate VAR.
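
The ratio in Eq. (53.1) can be written as a one-line function. The sketch below only illustrates the definition: the dollar amounts for $PO and $Climate VAR are hypothetical placeholders, since actually estimating either quantity is the hard part.

```python
def climate_reward_to_risk(positive_opportunity: float, climate_var: float) -> float:
    """CRR = $PO / $Climate VAR, following Eq. (53.1)."""
    if climate_var <= 0.0:
        raise ValueError("climate_var must be positive")
    return positive_opportunity / climate_var


# Hypothetical inputs, in arbitrary currency units.
without_mitigation = climate_reward_to_risk(positive_opportunity=0.0, climate_var=100.0)
with_mitigation = climate_reward_to_risk(positive_opportunity=30.0, climate_var=80.0)

print(f"CRR without mitigation: {without_mitigation:.3f}")
print(f"CRR with mitigation:    {with_mitigation:.3f}")
```

In this made-up case mitigation raises CRR through both channels at once: the numerator becomes positive and the denominator shrinks.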


The Time-Scale Incentive Problem 


Business, politics, and human psychology operate on short time scales. Climate 
mitigation and adaptation operate on long time scales, and require commitment 
long into the future. The incentives to act are thus impeded, even if there were no 
economic problems or direct obstruction (cf. App. IV). The challenge is to 
motivate immediacy and necessity of climate risk management to those in the 
current generation who have the capability of acting. The increasing impacts of 
climate change are now becoming more severe, which will help this motivation. 


The Free Rider Problem 


There is a “free-rider” problem" of benefiting from fossil-fuel consumption but 
ignoring the climate problem and/or shifting responsibility for acting to others". 


xxxii 


Long-term discounting for climate change impacts in the future 


The topic of discounting is extremely important in setting climate policy. Climate 
change perspectives occur over long time horizons of 50-100 years or more. To 
assess relative mitigation/adaptation/BAU strategies that vary with future time, 
the use of standard finance discounting of the marginal cash flows resulting from 
these impacts would seem like a reasonable procedure to compare results on an 
equivalent basis. Discount rates are critical in cost-benefit calculations, climate 
impact calculations, and climate mitigation calculations. High discount rates 
demotivate action on climate. 
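
To see why the discount rate matters so much over these horizons, consider the present value of a single future climate damage. The sketch uses standard annual compounding, PV = D / (1 + r)^T; the damage amount, the 100-year horizon, and the rates are all hypothetical.

```python
def present_value(future_damage: float, annual_rate: float, years: int) -> float:
    """Present value of a damage occurring `years` ahead, with annual compounding."""
    if annual_rate <= -1.0:
        raise ValueError("annual_rate must be greater than -100%")
    return future_damage / (1.0 + annual_rate) ** years


damage = 100.0  # hypothetical damage 100 years from now, arbitrary currency units
for rate in (-0.01, 0.0, 0.01, 0.03, 0.05):
    value = present_value(damage, rate, years=100)
    print(f"discount rate {rate:+.0%}: present value {value:8.2f}")
```

At a 5% rate, a damage of 100 a century away is valued at less than 1 today, while a zero rate leaves it at 100 and a negative rate values it above 100. This is the mechanism by which high discount rates make future impacts look negligible.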


? Other Possible Climate Change Ratios: The Climate Change Reward-to-Risk Ratio 
is “reward of acting" / “risk of not acting". Another ratio “reward of acting" / “risk of 
acting" could be defined, like a standard financial ratio applied to climate mitigation. 
Other benefit/cost measures exist. 


? Acknowledgement: I thank Bruce Bueno de Mesquita for correspondence on the free- 
rider problem and the time-scale incentive problem. 

However there is no theory from a fundamental perspective, or an empirical 
practical perspective, for a definite long-term discount rate. In fact, there are 
good reasons to choose zero discount rate, or even negative discount rate. 

This section argues that high discount rates used in some climate economic 
modeling are neither reasonable from several vantage points nor ethical. 


1. Market Perspective for a long-term Discount Rate 


The longest liquid existing rate - the 30-year U.S. Treasury rate - has varied 
widely, recently being 1.2% per year, far lower than its value in the 1980s. 

In reality, assumptions for market discount rates are tied up with business 
cycles and financial policies, which will depend on future climate impacts. The 
presence or absence of inflation, details of monetary policy, growth rates etc. lead 
to widely varying estimates of long-term discount rates. 


2. Discount rates, Economic Models, an Inconsistency, and a Scenario 


Economic models tie discount rates variously to assumptions of growth, future 
productivity output increases, returns on investments, the Weighted Average Cost 
of Capital (WACC), and other economic rates. Such high discount rates make 
our descendants pay for credit risk of companies that invest today. 


2a. Economic Models Produce Many Possible Discount Rates 


S. DeCanio’s book Economic Models of Climate Change, A Critique makes it 
clear that wide ranges of discount rates appear from economic models, including 
the possibility of zero or even negative rates. 


2b. Potential Inconsistency of discount rates with unhealthy economies 


Standard assumptions are that climate impacts do not fundamentally change 
future conditions. Economic models essentially assume current and future 
economic states as “healthy”. There is a feedback problem that can render some 
economic modeling inconsistent. 

If we continue with BAU without much action on climate, climate impacts 
can lead to a change of economic state to “unhealthy”, producing fundamentally 
changed economic conditions for our descendants. Growth and productivity 


44 Long-term rates: There are a few very long-term interest rates (e.g. 100 year bonds), 
but these only refer to compensation to the buyers of these bonds for specific purposes. 


45 What about other long-term finance aspects like credit? Other issues exist. Should 
the increase of default credit risk due to increasing climate impacts be taken into account? 
If so, how? There are no market instruments like 100 year credit default swaps, and even 
if there were, these would only reflect risk from a current market perspective that today 
ignores climate impacts on a long term basis. 

output may decrease or even stop. In that case the original economic model 
assumptions assuming a future healthy economy break down along with the 
model assumptions for discount rates; this is the potential inconsistency. 

There is a related trap. If we use standard model assumptions, discount rates 
high enough to demotivate climate action can ensue. Subsequently, BAU takes 
over with consequent future huge climate impacts, and the economic model 
assumptions break down again. 


2c. Economic scenario leading to low interest rates 


A possible economic scenario has interest rates kept low by central banks in an 
effort to stimulate economies weakened by the effects of climate change. 


3. Decoupling Discount Rates from Market Rates; “Collateral” Analogy 


I now propose a model using a concept taken from recent finance market practice 
for swaps since the 2008 recession. Swap pricing decouples the discount rate (at 
risk-free OIS, near zero) from credit rates (at market levels); cf. Ch. 8. 

This decoupling in modern finance requires collateral assets to replace the 
risk that otherwise would be contained in the discount rate. The “collateral 
assets” in this suggested analogy would be the assets of our descendants. 

The idea is that we living have a virtual deal with our descendants. Climate 
policy leading to our investments today mitigates climate impacts for our 
descendants. Standard financial discounting applies to today’s calculations for 
project finance. BUT no discounting should be used for calculating climate 
impacts on our descendants or benefits from our mitigation-aimed investments. 
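
The decoupling idea can be illustrated with a minimal Python sketch. All numbers here are hypothetical, chosen only for illustration: the project costs, the avoided future impact, and the 5% business rate are assumptions, as is the helper name `present_value`. Project costs are discounted at the business rate, while avoided climate impacts for descendants are either left undiscounted (the proposal) or discounted at the business rate (for comparison). Continuous discounting exp[-rate * time] is assumed.

```python
import math

def present_value(cash_flows, rate):
    """Continuously discounted present value of (year, amount) cash flows."""
    return sum(amount * math.exp(-rate * year) for year, amount in cash_flows)

# Hypothetical inputs (not from the book)
project_costs = [(0, 100.0), (5, 50.0)]   # mitigation spending today and in 5 years
avoided_impacts = [(85, 400.0)]           # impact avoided for descendants in 2100
business_rate = 0.05                      # assumed project finance rate per year

# Project finance: standard discounting at the business rate
pv_costs = present_value(project_costs, business_rate)

# Benefits to descendants: zero discount rate versus the business rate
pv_benefit_decoupled = present_value(avoided_impacts, 0.0)
pv_benefit_business = present_value(avoided_impacts, business_rate)

print(f"PV of project costs:              {pv_costs:8.1f}")
print(f"PV of benefits, zero discounting: {pv_benefit_decoupled:8.1f}")
print(f"PV of benefits, business rate:    {pv_benefit_business:8.1f}")
```

With these made-up inputs the same mitigation project looks worthwhile when benefits are not discounted and not worthwhile when they are discounted at the business rate, which is the sensitivity the text describes.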


46 Demotivating climate change action with high discount rates: Climate impact costs 
can be discounted with rates high enough to make future impacts to our descendants 
appear insignificant now. This motivates BAU (i.e. little action on our part) because 
“why should we spend money to mitigate climate now if climate impacts to our 
descendants are small?” In fact that is an argument that contrarians promote (App. IV). 


47 Collateral and Assets of Descendants: Those living of course own all existing 
collateral. The analogy means essentially that those living should hold some collateral in 
trust for descendants. The collateral value is the amount needed to make the discount rate 
close to zero, just as for swap pricing. 

Taking the analogy further, climate policy for such collateral could roughly play the 
part of a swap credit support annex CSA. 

In finance, with impaired collateral assets, the discount rate is increased. The 
proposal here, applying the idea to climate, runs the dynamics backwards. High discount 
rates in models can demotivate action on climate, subsequently producing impaired assets 
of our descendants by unabated climate impacts. 


48 Standard Business Rates for Project Finance: Investing that does not involve 
collateral uses a discount rate including the credit risk in funding the investment, which is 
standard business practice for project finance considerations. Ordinary investing is not 
affected; this is what investors will do in absence of inter-generational ethical 
considerations and in absence of climate policy. The argument here is the discount rate 


4. The Ramsey Formula and a Modified Formula with Climate Risk 


A common economic modeling assumption uses the Ramsey formula for 
the “social discount rate” $s^{(Ramsey)}$. There are two parts to $s^{(Ramsey)}$: 
(1) a “Pure Rate of Time Preference” or “PTP rate” $\gamma$, and (2) a fraction 
$\eta$ (the marginal elasticity of utility) of growth $g$ of per-capita consumption. 
Explicitly, 


$$ s^{(Ramsey)} = \gamma + \eta \, g \qquad (53.2) $$ 


No unique model values exist for the parameters in Eq. (53.2). Indeed, 
Weitzman showed that negative Ramsey discount rates eventually occur over 
time in a hidden-state two-component stochastic growth process. 


4a. The Total Social Discount Rate: Modified Ramsey formula with Climate Risk 


I next suggest a modification to the Ramsey formula for the social discount rate 
Eq. (53.2) to include climate risk explicitly. Denoting climate risk by $R$ and 
the “marginal elasticity of climate risk” by $\eta^{(Climate)}$, I define the total 
social discount rate $s^{(tot)}$, with a negative climate risk term 
$-\eta^{(Climate)} R$, as: 


$$ s^{(tot)} = s^{(Ramsey)} - \eta^{(Climate)} R \qquad (53.3) $$ 
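
As a minimal numerical sketch, Eqs. (53.2) and (53.3) can be evaluated in Python, reading Eq. (53.3) as the Ramsey rate minus the product of the marginal elasticity of climate risk and the climate risk R. The function names and all parameter values below are hypothetical illustrations, not calibrated inputs; in particular the units and size of R are assumptions chosen so that the climate term cancels the growth term.

```python
def ramsey_rate(ptp, eta, growth):
    """Ramsey social discount rate, Eq. (53.2): PTP rate + eta * growth."""
    return ptp + eta * growth

def total_social_rate(ptp, eta, growth, eta_climate, climate_risk):
    """Total social discount rate, Eq. (53.3): Ramsey rate minus climate term."""
    return ramsey_rate(ptp, eta, growth) - eta_climate * climate_risk

# Hypothetical parameter values (assumptions for illustration only)
ptp = 0.0            # pure rate of time preference
eta = 1.5            # marginal elasticity of utility
growth = 0.02        # per-capita consumption growth per year
eta_climate = 1.5    # marginal elasticity of climate risk
climate_risk = 0.02  # climate risk R, in the same units as growth here

s_ramsey = ramsey_rate(ptp, eta, growth)
s_total = total_social_rate(ptp, eta, growth, eta_climate, climate_risk)

print(f"Ramsey rate:       {s_ramsey:.2%}")
print(f"Total social rate: {s_total:.2%}")
```

With these inputs the climate risk term offsets the growth term, so the total rate falls to the PTP rate; a larger climate term would push the total rate negative.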


5. The "Ultimate Forward Rate" UFR 

The UFR is a long-term interest rate entering into regulatory capital requirement 
specifications for European insurance companies and pension funds. The 
UFR is not designed as a putative discount rate for climate impact purposes. 


for deciding climate policy for the benefit of our descendants should not use the ordinary 
business rate. 


49 Determining the Climate Risk Term: The determination of the marginal elasticity of 
climate risk would be made from the negative impacts on growth from climate risk. The 
assertion here is that the climate risk term can cancel out the nominal positive growth 
term without climate risk in the Ramsey formula. 


50 Macro-Micro Growth Model? I believe the macro-micro model containing different 
dynamics at different time scales (cf. Ch. 50-51) could be applied to growth dynamics. 


51 The Ultimate Forward Rate UFR: The UFR has two parts, an assumed long-term 
average real yield plus an expected rate of inflation. The UFR value is determined via 
negotiations of insurance companies and regulators with the goals of stability and 
solvency of the industry (cf. Solvency II reference). Numerical algorithms connect the 
UFR (to 120 years) with long-term liquid market rates. I thank A. Litke for discussions. 


6. A Negative Ethics-Based Discount Rate (“Paying Forward”) 


There is a much more fundamental problem for the discounted cash flow 
assumptions of standard finance than the absence of a well-defined discount rate. 
Numerical choices for discounting imply that there is a monetarily quantifiable 
asset or liability to be discounted. Some potential climate impact liabilities, 
perhaps the most important, cannot be quantified. These include potential mass 
suffering due to climate impacts (App. II). Hence, an ethical discounting rate 
including climate impact effects arguably should be completely decoupled from 
current market interest rates that do not include climate impact effects. 

In fact, given the increased impacts our descendants will face under BAU 
assuming the present generation takes insufficient action, a negative discount 
rate could be the ethical choice from an intergenerational ethics perspective. 
This means that, for ethical reasons, we should make a penalty payment to our 
grandchildren if we — for whatever reason — refused to mitigate climate change 
seriously. Even if we don't care about future human suffering — which I hope we 
do — future generations will be forced to bear the monetary costs of our inaction 
or insufficient efforts on climate change today. 

Whatever our personal views on the matter, there is a strong argument to 
make for “paying it forward" — e.g. just like millions of families already save in 
anticipation of their kids’ college education. Essentially we take the money that 
we should have spent (but didn't) on climate action, perhaps include costs of 
additional future climate impacts, and give it to our grandchildren. 

This payment could be put into a trust for our descendants to cope as best 
they can with the increasingly expensive and destructive impacts of climate 
change, for which they had no responsibility. 

There is plenty of precedent for such a concept in modern-day political 
rhetoric. Politicians of all stripes—but especially more conservative budget 
hawks—like to talk about the deficits we will hand down to future generations if 
we don't do something now. Yet future generations are already certain to bear a 
burden in terms of finances, health, and quality of life from climate impacts: the 
question is how severe their burden will be. Therefore, now is the time to 
establish a mechanism to pay forward the costs of future climate change impacts. 


52 Negative Market Interest Rates? Recently some countries have in fact had negative 
interest rates (IMF report, ref). Basically you pay the bank to hold money, like paying for 
a safe deposit box. Also negative repo rates sometimes occur for bonds “on special”. 

Financial security pricing models are now being modified for negative interest rates 
on an industry-wide scale. Gaussian interest rate models are now standard in the industry; 
these have negative rates (cf. Ch. 43). 


53 Rates and Ethics: Given the future suffering due to future climate impacts left out of 
current economic calculations, I think that any present value calculation with any positive 
discount rate is not justifiable for dealing with the long-term well-being of our 
descendants. In particular, using a high discount rate is intergenerationally unethical. 


7. Table of Discount Factors from today 2015 to 2100 (85 years) 


The table shows discounting effects with different discount rates (1st column) 
from now to the canonical scenario date for climate impacts 2100. The 2nd 
column shows the discount factor that, for positive rates, decreases the future 
impacts viewed today, masking the problems of our descendants. The highlighted 
row with assumed rate 5% per year has discount factor = 1.4%. This means 
climate impacts felt in 2100 are reduced by the huge factor 1/0.014 ≈ 70 as 
viewed today, relative to not discounting. Also note that a difference of only 1% 
in the discount rate produces a highly significant factor of 2 in perceived climate 
impacts. Thus, discounting is absolutely crucial for climate policy. 

I also exhibit a negative discount rate (ethically compensating for our current 
near-BAU lack of action) of -1% per year in the top row, which amplifies 
impacts in 2100 to over 200% as viewed today. 


Time (years) = 85 

Discount rate (per year)     Discount factor 
        -1%                       234% 
         0%                       100% 
         1%                        43% 
         5%                       1.4% 
         8%                       0.1% 

Climate-Change-Induced Economic and Financial Crises? 


I think that the ensemble of climate impacts, unchecked in a BAU scenario, can 
trigger widespread economic and financial systemic instabilities and crises. 
Economic/financial systems are interconnected and are in non-equilibrium states. 
Perturbations or disturbances can trigger large instabilities. 

The trigger for the economic/financial crisis of 2008 was the collapse of the 
housing market in the U.S. and overleveraging. While some expected the housing 
bubble to pop, the attendant worldwide destabilizing effects were not anticipated. 

A detailed case study of a large financial institution that grew and crashed 
described excessive risk taking, inadequate risk controls, conflicts of interest, and 
a system that was (perhaps too) complex to manage and regulate (ref.). 

I think similar risk-breakdown effects in climate risk management could 
potentially enhance or cause an economic/financial crisis due to increasingly 


54 Discount Factor: Defined here as exp[ - discount rate * time] with time = 85 years. 
The effects become much more dramatic if we project forward several hundred years 
(recall Shakespeare lived several hundred years ago). 

severe climate impacts. The difference with 2008 is that if the global 
economic/financial system crashes, there will be no possibility of a bailout. 


Contributors to Climate-Induced Economic/Financial Crisis - Examples 


• Economic: Breakdown of supply lines and markets via increased political 
instabilities and the litany of other serious climate impacts (cf. Appendix II). 

• Energy: Fossil fuel sector collapse due to the stranded asset problem, but 
without sufficient renewable energy development to satisfy demand. 

• Financial: Overleveraging by our descendants for future borrowing to cope 
with climate impacts, with the impacts negatively affecting ability to repay. 

• Sharp Climate Instabilities: Should one happen again (they have occurred), 
economic/financial instabilities would become far worse. 


Business, Investors, Regulators, and Climate Change Risks 


Corporations and investors are starting to take notice of risks due to climate 
change. A multitude of business risks exists from climate impacts (App. II). 


Business Risks from Climate Impacts 


• Risky Business Report 2014 


Risky Business released its 2014 report with this statement: “The U.S. 
economy faces significant risks from unmitigated climate change. The Risky 
Business report presents a new approach to understanding these risks for key 
U.S. business sectors, and provides business leaders with a framework for 
measuring and mitigating their own exposure to climate risk.” 


• General Business Risk Considerations 


Climate risks to international supply lines and infrastructure, and other 
climate-enhanced risks, are critical. 


Business management needs substantial education on climate change. 


55 Risky Business is a joint initiative of Bloomberg Philanthropies, the Office of Hank 
Paulson, and Next Generation. The 2014 report is on business risk from climate change. 


56 Business Management - Education on Climate, Inertia: Education on climate will be 
key, including Boards of Directors, to achieve general business understanding and action 
on the risks of climate change. It is important to have a Chief Sustainability Officer. 
Some corporate management is composed of those who have risen to the top with 
yesterday’s technology, unconcerned with (and uneducated about) climate issues. They 
can be “male, pale, and stale” with inertia for changing a familiar business model to cope 
with new issues. They get paid for applying the old business model, using rules that they 
and their predecessors created, focused on short-term gains before they retire. 


Corporate Action on Climate Change 
• ESG (Environmental, Social, Governance) Ratings and Indices 


ESG ratings are criteria in ESG indices that rate business practices of 
thousands of companies worldwide. Bloomberg LP has an environment, social, 
and governance (ESG) data service. It allows investors to search and compare 
relevant ESG data on 5,500-plus public and private companies. 


• Climate Bonds, Green Bonds, Sustainable Energy Financing 


Climate-themed bonds are bonds targeted e.g. to low-emission transport and 
renewable energies, mostly investment grade. The Climate Bonds Initiative has 
an 8-point plan to promote the transition to a low-carbon economy. There are 
Green Bonds, with principles developed by JPMorganChase and other 
institutions, increasingly being issued by municipalities and corporations. A 
securitized ABS deal by SolarCity was issued in 2013. The World Bank, 
European Investment Bank (EIB), Asian Development Bank (ADB) and other 


xlviii 


banks finance sustainable energy projects. 
• Renewable Energies, Corporate Planning, and a Price on Carbon 


Businesses are increasingly turning to renewable energies. The CDP (formerly 
Carbon Disclosure Project) reports that some companies, including major oil 
companies, are incorporating a price on carbon into long-term financial plans. 


• Center for Climate and Energy Solutions (C2ES), Business 
Environmental Leadership Council (BELC) 


The C2ES’s BELC is “the largest U.S.-based group of corporations focused on 
addressing the challenges of climate change and supporting mandatory climate 
policy. The BELC is comprised of industry leading, mostly Fortune 500 
companies.” 


• The World Economic Forum Recommendation 


The World Economic Forum urged policy leaders to step up efforts to tackle 
rising greenhouse gas emissions, saying: “A climate-smart mindset incorporates 
climate change analysis into strategic and operational decision-making.” 


• The UNEP Finance Initiative 


Other corporate management is progressive, looks to the future, and adapts to new 
conditions if a significant threat to their current business model appears. 

There is sometimes a culture clash between young upward bound management that 
understands climate risk issues and old upper management that refuses to even try to 
understand. Change will occur when these management dinosaurs die off or retire. 

Companies in different sectors / regions are divided in their reactions to climate risk. 

The United Nations Environment Programme (UNEP) Finance Initiative is a 
“global partnership between UNEP and the financial sector... including banks, 
insurers and fund managers... to understand the impacts of environmental and 
social considerations on financial performance.” 


• The Natural Capital Declaration (NCD) 


The NCD was launched at the Rio+20 Conference in 2012, 20 years after the 
Rio Conference. The NCD has four commitments regarding “the eventual 
integration of natural capital considerations into private sector reporting, 
accounting and decision-making". 


• Cambridge Institute for Sustainability Leadership (CISL), Banking 
Environment Initiative (BEI) 


The CISL “works with leaders to develop practical solutions that reconcile 
profitability and sustainability and catalyse change in the wider system." The 
BEI works “with banks or corporates from anywhere in the world that have an 
interest in working together to advance sustainability." The BEI is associated 
with the CISL. 


• World Business Council for Sustainable Development (WBCSD) 


The WBCSD “is a CEO-led organization of forward-thinking companies that 
galvanizes the global business community to create a sustainable future for 
business, society and the environment." 


Investors and Climate Change 
• CERES 


CERES mobilizes investors, companies, and public interest groups to adopt 
sustainable practices. Its Investor Network on Climate Risk (INCR) collectively 
manages more than $11 trillion in assets. CERES said: “We cannot risk our kids’ 
future on the false hope that the vast majority of scientists are wrong". 


• Shareholder Resolutions and Climate Change 


Institutional investors (e.g. pension funds), faith-based investors, socially 
responsible mutual funds, and labor unions sometimes file shareholder 
resolutions; some are related to climate change. 

• Socially Responsible Investing 


Xx 


Socially Responsible Investing (SRI) is any investment strategy (e.g. 
renewable energies) that seeks to consider both financial return and social good. 


• Investors, a Price on Carbon, and Renewable Energies 


Chapter 53: Climate Change Risk Management 899 


Investors in renewable energies want to have clear signals of the price of carbon 
for business planning, including regulatory certainty. Uncertainty for carbon 
pricing discourages investment. 


• Divestment 


Divestment of shares in the fossil fuel sector has become more common, 
combining financial risk management and moral grounds. Some universities and 
religious organizations have passed divestment resolutions. 


Financial Regulation and Climate Change 
• Basel III and Climate Change 


Basel III is “a global, voluntary regulatory standard on bank capital 
adequacy, stress testing and market liquidity risk". Basel III does not incorporate 
climate change impacts. The CISL and UNEP FI stressed incorporating systemic 
environmental risks that potentially threaten banking stability into Basel III (ref). 


• SEC Guidelines on Reporting Climate Impacts 


In 2010, the SEC published “commission guidance regarding disclosure related 
to climate change” for reporting in business financial filings. 


Economic Models Related to Climate Change Impacts 


Climate Change, Financial Crises, and Economic Models 


Financial and economic models and standard analyses are basically linear 
equilibrium models including small perturbations. Linear models do not take into 
account economic instabilities that are non-equilibrium and nonlinear in nature 
and that can lead to crises. Some economic models do not consider impacts of 


57 More certain carbon pricing and investing: Investors need to include the price of 
carbon into business planning. This includes tax credits. I once sat through a conference 
on private equity venture capital renewable energy investing and heard over and over the 
need for more certainty in these areas before money could be committed. The breakeven 
time (how long before one gets investment money back) is affected. 


58 Divestment: As of 12/2014, the United Church of Christ, the Unitarian Universalist 
Association, and the World Council of Churches took fossil fuel divestment steps. This is 
significant from an ethical standpoint. Many other organizations are stepping up action. 


59 Crises, Economic Models, Unstable Equilibrium, Phase Transitions: Most 
economic models are equilibrium models, basically “in balance”. However a crisis is not 
an equilibrium state and cannot be described by equilibrium models. 

climate change at all. Others do include climate change impacts, but not in a 
framework including the possibility of crisis generation. 


Path Dependence and Boundary Conditions in Climate Impact Modeling 


Climate impact modeling (e.g. for SCC) in principle is path dependent in time. 
For example future climate impacts depend on what mitigation efforts have 
occurred before that time, technological advances, etc. In general Monte Carlo 
simulations need to be performed to incorporate path dependence. 

Moreover if economic boundary conditions are imposed at future time (e.g. 
temperature rise limitations, stranded asset constraints, etc.), the calculations of 
climate impacts become more difficult. 
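
Path dependence can be illustrated with a toy Monte Carlo sketch in Python. This is not a climate or economic model from this book: the emissions dynamics, the quadratic damage function, the noise level, and the two mitigation schedules are all hypothetical assumptions chosen only to show that two paths with the same total mitigation effort can produce different accumulated damages.

```python
import random

def cumulative_damage(mitigation, rng, noise=0.1):
    """Sum of yearly damages along one simulated path.

    Yearly damage is assumed proportional to the square of the emissions
    stock accumulated so far, so the total depends on the whole path.
    """
    stock = 0.0
    total = 0.0
    for m in mitigation:
        # Yearly emissions: baseline 1, reduced by mitigation, plus noise
        emission = max(0.0, 1.0 - m + rng.gauss(0.0, noise))
        stock += emission
        total += 1e-4 * stock ** 2
    return total

def monte_carlo_damage(mitigation, n_paths=2000, seed=0):
    """Average cumulative damage over simulated paths."""
    rng = random.Random(seed)
    return sum(cumulative_damage(mitigation, rng) for _ in range(n_paths)) / n_paths

years = 85
# Same total mitigation effort, applied early versus late (hypothetical schedules)
early = [0.5] * 40 + [0.0] * (years - 40)
late = [0.0] * (years - 40) + [0.5] * 40

damage_early = monte_carlo_damage(early)
damage_late = monte_carlo_damage(late)
print(f"early mitigation: {damage_early:.1f}, late mitigation: {damage_late:.1f}")
```

In this toy setup both schedules reach a similar final emissions stock, but the accumulated damage differs because it depends on when the mitigation occurred.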


Specific Economic/Climate Models 
• The Stern Review and Related Work 


The Stern Review on the Economics of Climate Change discusses the effect of 
global warming on the world economy. A significant extension by Dietz and 
Stern included climate impacts on capital and productivity, with large effects. 


• Summary of impacts for climate change from economic models 


In physics, equilibrium has a definite meaning. For example, a ball on the top of a 
hill is in “unstable equilibrium" and a ball in a hole is in “stable equilibrium". A state that 
is in unstable equilibrium may seem like an equilibrium state until a large enough 
perturbation or disturbance occurs to destabilize it. If such large perturbations are not 
included in an economic model, the model will not recognize the possibility of a crisis. 

The economy is usually in unstable equilibrium. It can be — and has been - thrown 
into crisis. The transition into crisis resembles a "phase transition" in physics. My 
assertion is additional crisis impetus exists from climate change. 


60 Finance Crisis Models: See Ch. 46 for a non-linear crisis model based on critical 
exponents and Ch. 51 for a currency crisis model (Mills and Omarova). 


61 Climate Economic Model Risk: This book has spent a lot of time discussing model 
risk. Similar comments hold for economic models with climate impacts. Models have 
uncertainties, both in their equations and in their inputs. This does not mean we should 
ignore models, because we need all the input we can get. 


62 Type I, Type II Errors and Climate: Climate risk management says we should err on 
the side of safety when discussing models. The ethical choice is that it is better to make 
some “type II” errors (estimating too much risk and costing us some extra money now) 
than a big “type I” error (missing the really bad climate risks and making the lives of our 
descendants miserable). The true null hypothesis H0 is that climate risks exist; a type I 
error is incorrect H0 rejection. (Nb: this H0 differs from the H0 resulting from an 
interchange of the type I, II definitions used e.g. by N. Oreskes.) Type I errors are 
enhanced by a contrarian framework “requiring proof" of climate impacts (see App. IV). 

DeCanio’s book critiqued economic models of climate change (supra). In 
general, multiple stable equilibria can exist with different discount rates, and 
without clear dynamics for selecting one in reality. Much economic modeling is 
done with restrictive assumptions that frontload the benefits to current society. 


Tol collected economic model estimates for the welfare impact of climate 
change for assumed temperature increases, expressed as an equivalent income 
gain or loss as % of global aggregate GDP. Income was lost according to most 
models. See also Ward (ref). The results are in the IPCC WG2 report (SM, ref). 


• REMI Model Shows Positive Economics of a Carbon Fee and 100% 
Dividend 


Regional Economic Models, Inc. analyzed a carbon fee with 100% dividend to 
households (“revenue-neutral carbon tax"), proposed by the Citizen's Climate 
Lobby (cf. below). Economic stimulus for the U.S. economy occurs. Jobs are 
created because the fossil fuel industry is highly automated and employs 
relatively few people. Fewer fossil-fuel jobs are lost over time compared to the 
extra jobs needed to service the increased, diversified, local, labor-intensive 
demand from households receiving the dividend. 


Most households, including those with low and medium income, come out 
ahead, because their dividend is greater than their increase in energy costs. 


• U.S. Environmental Protection Agency (EPA) Models 
The EPA assesses the economic implications of policies to reduce the greenhouse 
gas intensity of the U.S. and analyzes long-term climate scenarios; cf. the 
footnote on EPA economic/climate models. 

• International Monetary Fund (IMF) Models 


IMF economic/climate modeling and comparisons to other studies are in the references. 


63 REMI Economic Model: The REMI model uses standard perturbed baseline economic 
scenario analysis, with small perturbations. The “perturbation” means that the model is 
considered with the fee + dividend, relative to the model without the fee + dividend. 

Explicit climate effects are not included. This means the REMI model is conservative 
because extra positive effects exist from avoiding more severe climate impacts. 

Actually, the “REMI model” integrates three separate models: the “REMI PI+” 
dynamical model of subnational units of the U.S. economy, the “ReEDS” power 
generation model built by the National Renewable Energy Laboratory and run by Synapse 
Energy Economics, and the Carbon Analysis Tool “CAT” carbon tax revenue model built 
from the Annual Energy Outlook from the EIA (ref). 


64 EPA Economic/Climate Models: The Applied Dynamic Analysis of the Global 
Economy (ADAGE) model is a dynamic computable general equilibrium model capable 
of investigating economic policies at the international, national and regional levels. Other 
models: Non-CO2 Projections and Abatement Models and the Mini-Climate Assessment 
Model (an Integrated Assessment Model including human systems and climate change). 


Wrap Up of Climate Risk Management 


We have come a long way. “We still have a long way to go.” The book's 
introduction listed some unsolved problems, including climate change risk 
management, which I think is the single most important risk issue for finance and 
humanity. Potential risk exists for climate-change induced destabilization of our 
inherently unstable worldwide economic and financial systems. 

I discussed a Climate Change Value at Risk and a Climate Change Reward- 
to-Risk Ratio, along with a formal structure for climate risk management. 

Appendices I, II, III below summarize: (I) The scientific basis of climate 
change; (II) The myriad increasingly serious climate change impacts; and (III) 
Mitigation of climate change risks plus adaptation. Appendix IV discusses 
"climate contrarian" obstruction risk. 

The references have in-depth background and technical information. 


Epilogue 


What will be our legacy? The worst climate change impacts can still be 
avoided. There will be no safe haven if we do not act responsibly. We can be 
effective. We need courage. There are many obstacles. There are many 
positive opportunities. We have to be optimistic. We need to adopt Climate 
Change Risk Management, now. 


There is no other choice. 


Appendix I. The Physical Science Basis of Climate Change 


Yes, Virginia, global warming exists and we are causing it 


Observations 


The most relevant observation from data is post-1975 anthropogenic global 
warming, i.e. the surface temperature upward trend of climate change, due mostly 
to humans burning fossil fuels. The basic physics driving global warming has 
been known since the 19th century. Increasing amounts of carbon dioxide CO2 
lead to the planet increasingly trapping heat from the sun. The last few decades 
have been the hottest since global temperature was directly measured (late 19th 
century), and the hottest in over 1,000 years according to proxy temperature 
data. The average global temperature has risen to a level above most global 


65 Data: Various data sources exist. See refs for some details, including quality assurance.


66 Anthropogenic CO2: Excess atmospheric carbon in CO2 is due to fossil fuel burning
(not natural sources), as distinguished by isotopic analysis, and is unambiguous. CO2 is
now around 400 ppm, a level not seen for 800,000 years (generally under 300 ppm), and
today’s high level is increasing rapidly. “ppm” = parts per million by volume, the unit of
CO2 measurement in the atmosphere. Currently humanity is emitting CO2 at the rate of
over 30 billion tons per year, equivalent to the weight of one large elephant per person
per year. Per capita, developed countries emit far more CO2 than others (including China).
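The per-person figure can be checked with one line of arithmetic. The sketch below uses the 30 billion tons from the text; the world population of about 7.2 billion is an assumption (an approximate mid-2010s value) and is not stated in the text.

```python
# Rough per-capita check of the "one large elephant per person" comparison.
emissions_tons_per_year = 30e9   # from the text: "over 30 billion tons per year"
world_population = 7.2e9         # assumption: approximate mid-2010s world population

tons_per_person = emissions_tons_per_year / world_population
print(f"{tons_per_person:.1f} tons of CO2 per person per year")
```

A few tons per person per year is indeed in the range usually quoted for the weight of a large elephant.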


67 Greenhouse Gases GHGs: GHGs, along with the positive water vapor feedback that 
exists, are responsible for the recent observed global warming trend since 1975. CO2 is
the predominant GHG. Methane (per molecule) is a more potent GHG and will become 
more important if permafrost or frozen deep-water methane deposits melt significantly; 
this is not expected at least before 2100 (but uncertainty exists, so the problem might rear 
up earlier). Other GHGs exist with less impact. 


68 The “Hockey Stick” (yes it’s there): Data analyses by many independent groups have
concluded that the recent rapid global temperature increase and current temperature levels 
are unprecedented in over 1,000 years. The temperature graph resembles a “hockey stick” 
with the “blade” averaging observed rising surface global temperatures. The hockey stick 
is not necessary to understand recent global warming, but it is significant. Data are from 
sediments, ice cores, tree rings, corals, stalagmites, pollen, and historical documents. 
temperatures since the beginning of human civilization. This global
warming trend is surrounded by fluctuations, and is not uniform.

Climate change in the past was driven by natural causes. Climate change is 
now primarily driven by anthropogenic causes. There is no other explanation
with a physics-based foundation that is not in conflict with data. 


Scientific Consensus 


There is strong scientific consensus among primary-source climate science
experts and also scientific societies, regarding the above scientific facts. 


69 We are forcing the climate out of Holocene Range: The climate has been stable in
the last 10,000 years (the Holocene period) during which human civilization developed.
Humans have increased CO2 to produce temperatures now above 90% of historical
temperatures in the Holocene (ref). This global "experiment" is without precedent.


70 Global Warming, El Niños and La Niñas, the PDO, Chaos: The hottest surface
years since 1880 were 2014, 2010, and 2005. Nine of the hottest ten years are since 2000. 

The global warming trend in land/ocean surface temperatures over the last 30 years 
(since 1975) is 0.16 to 0.18 °C/decade (GISTEMP, Berkeley, NOAA, HadCRUT4) and
0.12 to 0.14 °C/decade for satellite data (RSS, UAH). These trends are significant, well
outside 2 standard deviations of uncertainty. Trends from Cowtan's website (ref). 
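As a hedged illustration of how a trend and its uncertainty of this kind can be estimated, the sketch below fits an ordinary least-squares line to annual temperature anomalies and reports the slope in °C/decade with a naive two-standard-deviation band. The data are synthetic (an assumed 0.17 °C/decade trend plus Gaussian noise), not taken from any of the data sets named above, and the standard error ignores autocorrelation, which careful analyses account for.

```python
import math
import random

def ols_trend(years, temps):
    """Return (slope, standard error of slope) of an ordinary least-squares line."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, temps))
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_err

# Synthetic anomalies: assumed 0.017 C/year trend plus Gaussian noise (illustration only).
random.seed(0)
years = list(range(1975, 2015))
temps = [0.017 * (y - 1975) + random.gauss(0.0, 0.1) for y in years]

slope, std_err = ols_trend(years, temps)
print(f"trend = {10 * slope:.3f} C/decade, 2-sigma = {10 * 2 * std_err:.3f} C/decade")
```

A trend counts as significant in the sense used above when the fitted slope exceeds twice its standard error.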

The atmospheric global warming trend is not smooth, resembling a "staircase". Noise 
and oscillations can temporarily obscure the global surface warming temperature trend, 
including El Niños that increase surface temperatures and La Niñas that decrease them, 
volcanic or other aerosols that decrease them, and the 11 year solar cycle that has a 
smaller effect. The Pacific Decadal Oscillation (PDO), because it oscillates, is not
relevant for the long-term global warming trend (e.g. out to 2100). 

Some noise can be removed using the simple procedure of plotting separately the
temperatures from years with: (1) El Niño, (2) La Niña, (3) Neutral conditions. The temperature
trend of each series after 1975 is up, showing global warming more clearly.
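The separation procedure can be sketched as follows: group the years by ENSO state and fit a trend to each group. The year labels, the per-state offsets, and the common trend are all hypothetical values chosen for illustration; real classifications come from an ENSO index, which is not reproduced here.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

def trends_by_category(records):
    """records: iterable of (year, temperature, category). Returns {category: slope}."""
    groups = {}
    for year, temp, category in records:
        xs, ys = groups.setdefault(category, ([], []))
        xs.append(year)
        ys.append(temp)
    return {cat: slope(xs, ys) for cat, (xs, ys) in groups.items()}

# Hypothetical data: a common 0.017 C/year trend with a fixed offset per ENSO state.
offsets = {"El Nino": 0.1, "La Nina": -0.1, "Neutral": 0.0}
labels = ["El Nino", "Neutral", "La Nina"]
records = []
for i, year in enumerate(range(1975, 2015)):
    category = labels[i % 3]
    records.append((year, 0.017 * (year - 1975) + offsets[category], category))

print("pooled:", round(slope([r[0] for r in records], [r[1] for r in records]), 4))
for category, s in trends_by_category(records).items():
    print(category, round(s, 4))
```

In this toy setup each separated series recovers the underlying trend without the year-to-year scatter caused by switching between states.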

Atmospheric temperatures early in the 20th century show a rise, then a mid-century
plateau, probably involving time-dependent aerosol concentrations. 

See Appendix IV for discussion of the recent so-called global warming “hiatus”, 
more accurately a surface slowdown in global warming since around 1998. 


71 Chaos and Climate, a one-line summary: Climate is mostly NOT chaotic but weather
IS chaotic. Here is an analogy: North of the equator, June is warmer than January on the 
average; this is climate. Which days are warmer than other days in a given month is 
chaotic; this is weather. Complications at intermediate time scales include times of onset 
and durations of El Niño and La Niña, and PDO phase changes. See Realclimate.org. 


72 Attribution of Global Warming to Humans: (1) Attribution of global warming since 
around 1975 to human activity is clearly demonstrated using climate models, containing 
both natural and human causes. If only natural causes (sun, volcanoes) are included, 
models cannot reproduce temperature data even on average. If human activity is included, 
models can reproduce temperature data on average. See below for models. (2) There are 
fingerprints of human influence on global warming that do not depend on models (ref). 


73 What about Climate Science Consensus? Wide consensus (over 90%, possibly 97%, 
according to the Skeptical Science “consensus project” and multiple other sources) exists 
Climate Models


Climate models and data provide the equilibrium climate temperature
sensitivity to CO2 doubling; the consensus median sensitivity is around 3°C.
Climate scenarios typically cover the next 50-100 years. Climate impacts
on spatial scales as small as cities can be projected.


among climate science primary expert sources, that Earth has been warming and humans 
are now the main cause. Consensus does not mean agreement over every detail. Increased 
consensus exists with increased climate expertise. 


74 Climate Models: Climate models are essential for understanding climate change,
what-if scenario analyses, and rational mitigation/adaptation planning. However we do 
NOT need models to see that climate change impacts are already here — see App. II. 


Climate models are based on physics and coupled mathematical equations for land, 
ocean, atmosphere, ice, vegetation, the sun etc; with some empirical parametrizations 
(e.g. clouds, aerosols). The equations include both natural and anthropogenic “forcings” 
of the climate (ref). The atmospheric temperature is an output, not obtained by fitting 
parameters. Models are successfully backtested vs. temperature data at continental scales 
over the last 100 years. Models reproduce temperature responses to volcano eruptions. 

Even with increasing sophistication, model results for climate sensitivity to CO2 have
basically remained consistent since the first models were created. This is important 
because it shows that a missing effect is unlikely to change overall conclusions. 

A model run produces temperatures that fluctuate with time, and so ensembles of 
runs are used, with different initial conditions. There are models of varying complexity 
used for different purposes. 

Model code is in FORTRAN and run on parallel supercomputers. Different 
laboratories have independent codes. Model uncertainties exist, are examined, and are 
clearly exhibited in reports. 

Climate impacts can be estimated using models at “regional” scales currently around 
100-200 kilometers (60-120 miles). Impacts at small spatial scales are hard to evaluate. 
Climate model resolution is rapidly improving, due to increased processor speeds and 
massive parallel computation. 

Models are in general agreement with observation. They do not get every detail 
correct (they are models, after all). An incorrect detail does not invalidate the model for 
those data that the model does describe, within the errors of the model and in the data. 

Expertise is needed to interpret model results and to ensure that models are not
misapplied to situations for which they were not intended. 


75 Equilibrium Climate Sensitivity: The consensus view from models and climate data
(both ancient and modern data) is around 3 °C for the equilibrium climate sensitivity (ECS),
with a central range of 1.5 to 4.5 °C and significant outliers. “ECS” means that the CO2 level is
assumed doubled and the earth system runs to steady-state equilibrium. A lower value of
ECS only "buys some time" for the problems of worsening climate change impacts.
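To make the definition concrete, the sketch below uses the commonly used logarithmic approximation, in which equilibrium warming scales with the base-2 logarithm of the CO2 ratio, so that one doubling gives exactly the ECS. The approximation itself and the preindustrial baseline of 280 ppm are assumptions brought in for illustration; neither is stated in the text.

```python
import math

def equilibrium_warming(co2_ppm, baseline_ppm=280.0, ecs=3.0):
    """Equilibrium warming (deg C) under the logarithmic approximation.

    baseline_ppm is an assumed preindustrial CO2 level; ecs is the warming per doubling.
    """
    return ecs * math.log2(co2_ppm / baseline_ppm)

# Low, central, and high ends of the central ECS range quoted above.
for ecs in (1.5, 3.0, 4.5):
    doubled = equilibrium_warming(560.0, ecs=ecs)
    today = equilibrium_warming(400.0, ecs=ecs)
    print(f"ECS {ecs}: doubling -> {doubled:.2f} C, 400 ppm -> {today:.2f} C")
```

These are equilibrium values, reached only after the earth system has had time to adjust, so they are not forecasts of the temperature at any particular date.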


76 Climate Scenarios and Ranges in Future Temperatures: With scenarios of future 
forcings based on human decisions (e.g. BAU vs active substantial mitigation), the 
earth's average atmospheric temperature can be forecast, including uncertainty ranges, to 
2100. All models show increasing temperatures by 2100 from bad to disastrous; the 
uncertainty depends on what humanity chooses to do or not do for climate mitigation. 
There are many scenarios. The new “RCP” Representative Concentration Pathway 
scenarios (in the 2013 IPCC science report) assume total climate forcings (ref) in 2100 
Appendix II. Impacts of Climate Change and Vulnerabilities


“Mother Nature always bats last”


Impacts of climate change are overwhelmingly negative. Impacts are being
observed now, with real financial losses, and will become much
more severe under a Business-as-Usual (BAU) scenario. Climate impacts will
increasingly and negatively affect people and society, including business.
Climate impacts fall into several categories, enumerated next.


1. Physical Impacts of Climate Change 


Climate change under a BAU scenario will increase the severity and/or frequency
of many types of impacts. See the IPCC WG2 TS (supra) and refs.

• Ocean acidification


relative to preindustrial times, in units of watts/(meter^2). The 2007 IPCC “A1” SRES
scenario is Business as Usual (BAU), “B1” assumes substantial mitigation, etc. Some
scenarios assume a doubling of CO2. Others assume equilibrium atmospheric CO2 at
given levels, e.g. 450 ppm. For a given scenario, climate models are used to estimate the 
average temperature response and future temperature uncertainties. 

These scenarios do not include impacts from climate tail risk “tipping points”
induced by climate. Since such tail risk is not zero, these scenarios are conservative.


The scenarios do not try to “predict the weather” out to 2100, which is not possible.


77 The Psychological Time Problem and Climate Change Action: Time is one of the
most difficult psychological issues to overcome to motivate climate mitigation action, 
since people generally think in a short time framework. 

The future to 2100 is often used to frame the climate problem. This does not mean 
that we should ignore longer time scales. After all, Shakespeare lived 400 years ago and 
recorded history goes back several thousand years, so looking at climate impacts into the 
future should not be restricted to 85 years. Impacts will get worse after 2100 under BAU. 


78 Quote: R. Watson, “father of LEED” (Wikipedia). LEED (Leadership in Energy and 
Environmental Design) is a set of rating systems intended to help building owners and 
operators be environmentally responsible and use resources efficiently. 


79 “Like dropping your watch”. The earth and your watch are finely tuned systems.
Perturbing a finely tuned system leads to bad results. When I give a climate talk I hold up
my watch and ask “If I were to drop this, what is the probability that its functionality will
increase?” With climate change under BAU the earth's functionality won't increase
either. This appendix lists the increasingly severe problems; see references for details. 


80 Ocean Acidification: When CO2 is absorbed in water, carbonic acid is produced. Due
to increasing CO2 the ocean is rapidly getting more acidic; this is a deterministic statement.
Ocean acidification has serious consequences for the marine food chain.
• Stronger droughts
• Stronger precipitation/flooding events
• Stronger fires
• More intense hurricanes and other extreme weather events
• Decreasing ice, snow packs, and glaciers
• Increasing sea-level rise (SLR)
• Temperature distribution distortion; high temperature records


81 Stronger Droughts: With warming, more evaporation occurs, causing stronger
regional droughts if conditions are not appropriate for rain, e.g. summer in dry areas.


82 Stronger Precipitation Events, Flooding: Stronger precipitation events will occur in
some regions, with more flooding. This is because warming increases evaporation. 
Warmer air holds more moisture, producing more rain if conditions are appropriate (e.g. 
winter). More moisture can produce more snow if cold and warm air masses collide. 
Warmer air over oceans from global warming can make storms worse in coastal areas. 


83 Stronger Fires: More precipitation caused by global warming in winter causes more 
underbrush to grow, which dries during the summer. Adding hotter conditions, increasing 
wind speed, and increased lightning produce stronger wildfires in summer. I grew up in 
California; the state is now experiencing wildfires outside of historical experience. 


84 Stronger Hurricanes: Although there are competing factors, there is evidence that
global warming is increasing the intensity of the largest hurricanes. Note a hurricane is 
always a tail event. Climate change is enhancing the tail risk. 


85 Ice, Glaciers, Snow Packs retreating; Paleoclimate data warnings: Since recent
global warming started, arctic sea-ice at a given season (e.g. summer) has been 
decreasing rapidly. Glaciers are retreating worldwide with rare exceptions. Antarctica 
consists of several regions and is more complicated. Ice dynamics (e.g. destabilization of 
ice shelves due to warming sea water; "moulins" of descending melt water lubricating
glacier bottoms), produce glaciers sliding from land into sea without melting first. 

Snow packs that provide water are also in decline, for example the U.S. west coast in 
the Sierra and Cascade mountain ranges. 


Paleoclimate data (mid Pliocene around 3 million years ago) show that with 
temperatures 3°C above current levels - as is possible by 2100 - the West Antarctic Ice 
Sheet collapsed. The time scale for collapse is not known. Huge "pulses" of Antarctica 
ice-sheet melt in the past led to several meters of sea level rise per century (refs). 


86 Sea-Level Rise (SLR) and U.S. Navy Scenarios: “Based on recent peer-reviewed
scientific literature, the Department of the Navy should expect roughly 0.4 to 2 meters 
global average sea-level rise by 2100, with a most likely value of about 0.8 meter.” See 
ref. Some major consequences of SLR are listed in the next section. 


87 More high temperature records: The number of high-temperature records is
increasing, and the number of low-temperature records is decreasing. There is a subtlety 
in measuring since it gets harder to exceed previous maxima as time progresses. Early in 
the record history it is easy to get a temperature record — the first temperature is by 
definition a record high. Recent high temperature records are therefore more significant 
than old high temperature records. 
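The subtlety can be quantified for the idealized case of a climate with no trend. If yearly values are independent draws from one fixed continuous distribution, the chance that year n sets a record is 1/n, so the expected number of records in n years is 1 + 1/2 + ... + 1/n. The sketch below checks this against a simulation; the independence and no-trend assumptions are simplifications made for illustration only.

```python
import random

def expected_records(n):
    """Expected number of record highs in n i.i.d. observations: 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

def count_records(series):
    """Count values that exceed every earlier value (the first value always counts)."""
    count, best = 0, float("-inf")
    for value in series:
        if value > best:
            count, best = count + 1, value
    return count

random.seed(1)
n_years, n_trials = 100, 2000
simulated = sum(
    count_records([random.gauss(0.0, 1.0) for _ in range(n_years)])
    for _ in range(n_trials)
) / n_trials
print(f"expected {expected_records(n_years):.2f}, simulated {simulated:.2f}")
```

In such a stationary series records become rare quickly, which is why a sustained stream of new high-temperature records late in a long series points to a shifting distribution.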
• Possible effects on the jet stream and “unusual weather”


Global warming is making some forms of extreme weather events more likely.
Attribution of extreme weather events to climate impacts is mostly statistical,
somewhat like the harmful effects of smoking.


2. Impacts on Individuals from Climate Change 


The physical impacts of climate change have direct impact on human individuals: 
• Increasing food shortages


88 Temperature Distribution Distortion with time because of climate change
(statistics): Roughly, the statistical distribution of a variable like temperature can be
thought of as a Gaussian whose width increases and whose center moves to higher
temperatures because of climate change. Hard data back this up. See figure above.
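A small calculation shows why a modest shift and widening of a Gaussian matters for extremes. The sketch below computes the probability of exceeding a fixed "hot" threshold before and after the change. The threshold, the shift of 0.5 standard deviations, and the 20% widening are hypothetical values chosen for illustration, not estimates from data.

```python
import math

def prob_above(threshold, mean, std):
    """P(X > threshold) for X ~ Normal(mean, std)."""
    return 0.5 * math.erfc((threshold - mean) / (std * math.sqrt(2.0)))

# "Hot" is defined here as 2 baseline standard deviations above the baseline mean.
threshold = 2.0
baseline = prob_above(threshold, mean=0.0, std=1.0)
shifted = prob_above(threshold, mean=0.5, std=1.0)   # hypothetical shift of the center
widened = prob_above(threshold, mean=0.5, std=1.2)   # hypothetical shift plus wider spread
print(f"baseline {baseline:.4f}, shifted {shifted:.4f}, shifted and widened {widened:.4f}")
```

The tail probability grows much faster than the shift itself, the same nonlinearity that makes tail risk sensitive to small changes in distribution parameters in finance.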


89 Possible Global Warming influence on the Jet Stream, Snowmageddon, and 
Sandy: The jet stream is a chaotic weather phenomenon, a globally circulating
high-velocity wind, north of which is the cold Arctic air. There are global warming
destabilizing influences on the jet stream that may be increasing the extent of southward 
jet-stream meanders and blocking patterns. There are two suggested dynamical causes. 
(1) Global warming is heating up the Arctic faster than the rest of the globe leading to a 
smaller temperature difference across the jet stream and possibly destabilizing it. (2) 
Building up of warm water in the Pacific with heat transfer to the Arctic, possibly 
destabilizing the jet stream. Both effects could be relevant.

Hence, ironically, "Snowmageddon" and also Superstorm Sandy may have been
exacerbated by global warming. 


90 More extreme weather events — why? The basic idea is that extra energy in the 
atmosphere due to global warming means more energy is available for the weather 
system to produce more extreme weather events than would occur naturally. 


91 Statistics, Climate Change Impact Attribution, Confidence Levels: Statistical
confidence for a given type of outlier event cannot be obtained without a long time series. 
On the other hand, the number of ways that extreme weather can be manifested (a sort of 
“entropy”) is large. This means we do not need long times to be sure that global warming 
produces more extreme weather of different types. This issue also occurs for scenario 
analysis in finance. 

We do not need to know all the details, as we do not need to know all the details to 
determine that smoking tobacco is harmful. 

There is a signal-to-noise problem for individual event attribution. However the 
science of individual event attribution is improving (AMS report, ref). Also there are 
deterministic trends that do not require statistics — e.g. the global energy to enhance 
extreme events is increasing due to global warming of the planet. 

Note that extreme weather enhancement means that severe events become more 
likely, but not a new maximum at each occurrence. 

Since global warming / climate change is the global environment in which extreme 
weather occurs, the presumption must be that there is some global warming component in 
any extreme weather event unless information exists to the contrary. 


92 Increasing Food Shortages from Climate Change, 3 reasons:
• Increasing water shortages
• Increasingly negative health and disease impacts
• Psychological impacts


3. Business and Finance Impacts from Climate Change 


Business / finance climate impacts were discussed above (see e.g. Risky 
Business, supra). These include supply line impacts, financial/economic stability 
impacts, national security impacts, decreased work productivity, and all the other
impacts listed in this section that will increasingly affect business and finance. 


(1) Direct data relating major crop yields to temperature, together with the temperature
increases expected under a BAU scenario by 2100, imply substantial yield reductions of many
crops, in some regions on the order of 50%. Plant physiology shuts down growth above threshold
temperatures. Other specific crop stresses include: 1A. Insect invasions from the south, 
no longer killed by cold weather, and for which crops are unprepared. 1B. Enhanced 
drought. 1C. Changes in precipitation. 1D. Increases in weeds / invasive plants that do 
well under global warming. 

(2) The ocean food chain is strained by ocean acidification and also by temperature 
increase which negatively affects phytoplankton; both effects result in fewer fish. 

(3) Speculators drive up food prices under shortage conditions, making food more 
expensive and putting some food out of reach for many poor people (with potential 
political destabilization). 

Future food shortages will be regionally dependent, and will get substantially worse 
as time goes on under a BAU or similar scenario. 


93 Increasing Water Shortages from Climate Change: Under BAU, hundreds of
millions to half the world population may suffer “extreme water scarcity” by 2100
(Hejazi, M. et al, ref).


94 Increasing Health Risk and Disease from Climate Change: Increases in disease due
to global warming / climate change will occur, e.g. from insect vector carriers. Insects are 
migrating north due to increased temperature and are not killed off in warmer winters. 

The Lancet and University College London Institute for Global Health Commission 
concluded: “Climate change is the biggest global health threat of the 21st century. Effects
of climate change on health will affect most populations in the next decades and put the 
lives and wellbeing of billions of people at increased risk.” 

From the American Public Health Association APHA: “Climate change is one of the 
most serious public health threats facing our nation”.


95 Negative Psychological Impacts from Climate Change: The American Psychological
Association paper (ref) has examples of the importance of psychological impacts from 
climate change. The main NGO representative of the APA personally emphasized this to 
me at the 2009 CoNGO Board meeting of UN NGOs in Vienna. 


96 Decreased Work Productivity from Increased Heat: People do not function well
under heat stress that will become common in some areas, decreasing work productivity. 
4. Societal Impacts of Climate Change

• Impacts on national security
• Future mass climate migration
• Increase in local conflict, regional wars, and terrorism
• Weakened marginal state governments
• Coastal flooding; threat to low-lying island states


5. Negative Climate Impacts on Other Species, Affecting Humanity 


• Likely species extinctions, negative impacts on biodiversity


97 National Security and the Climate Threat Multiplier: At the 2009 Copenhagen
Climate Conference, I attended a session at the U.S. Pavilion at which the Oceanographer 
of the Navy, Rear Admiral D. Titley, spoke. Representatives of the other branches of the 
U.S. Armed Forces and the Defense Department were online. They emphasized the 
serious nature of climate change in a national security context. See the U.S. Defense 
Department Quadrennial Reviews (QDRs) and other sources (refs). 


98 Future Mass Climate Migrations: A large percentage of humanity lives close to low-
lying sea coasts. Depending on the amount of sea level rise and the local environment, it 
is anticipated that worldwide hundreds of millions of people may be forced to migrate by 
2100. The QDR2010 (ref): "Climate change will contribute to food and water scarcity, 
will increase the spread of disease, and may spur or exacerbate mass migration." 


99 Increase in Local Conflicts, Regional Wars, Breakdown of Peace: As mentioned,
climate impacts can produce additional food shortages, water shortages, and mass 
migration. This means that more local conflicts will break out, which could be 
exacerbated into regional wars. Increased terrorism is also likely. From the QDR2014 (ref,
p. 8): "The pressures caused by climate change will influence resource competition while
placing additional burdens on economies, societies, and governance institutions around 
the world. These effects are threat multipliers that will aggravate stressors abroad such 
as poverty, environmental degradation, political instability, and social tensions — 
conditions that can enable terrorist activity and other forms of violence." 


100 Weakened Governments: From the QDR2010 (ref, p.85): "Assessments conducted by 
the intelligence community indicate that climate change could have significant 
geopolitical impacts around the world, contributing to poverty, environmental 
degradation, and the further weakening of fragile governments." 


101 Coastal Flooding; Existential Threat to Island Nations: Millions of people living 
near a seacoast will be subject to increased flooding due to sea level rise and increased 
storm surges. Some Island Nations are threatened with physical disappearance. 


102 Animal Migration, Biodiversity Loss, Species Extinctions: We humans depend for 
our survival on animals, plants, and ecology in profound ways. All are at risk from 
climate change. We ignore these risks at our peril. 

Wild animals are already trying to adapt to climate change, migrating to new places 
(mostly northward) to attempt to keep a steady environment. This is a powerful indicator. 
You don't need a climate model to see it. The animals are speaking directly to us. 




Appendix III. Mitigation and Adaptation for Climate Change


There Is a Free Lunch. It’s Called the Sun 


If Earth is to remain livable for our descendants, we need to mitigate climate 
change seriously, eliminate BAU mentality, and adapt to climate change when 
needed. Ethical, moral, and survival aspects are paramount. 


Reducing Fossil Fuels and other Aspects of Climate Change Mitigation 


There are already many programs of climate mitigation at all levels from
individuals to governments, with varying levels of effectiveness (see e.g. the U.S.
program reports). There is no silver bullet solution, and a portfolio of actions
will be necessary. Substantial increases in mitigation will be necessary to avoid
unacceptable future climate impacts (ref. IPCC WG3 Mitigation Report, supra).
Fossil fuels enabled the development of modern technological civilization
and constitute most current energy. Unlimited use of fossil fuels may have the
fatal side effect of ultimately rendering the planet unlivable for our descendants.
Climate mitigation primarily involves reducing fossil fuel use, called “deep
decarbonization”. To keep the planet livable, we need to decrease (not
increase!) fossil fuel use with “all deliberate speed” using other paradigms of


Biodiversity Loss: The Economics of Ecosystems and Biodiversity (TEEB) is a 
global initiative focused on drawing attention to the economic benefits of biodiversity 
including the growing cost of biodiversity loss and ecosystem degradation. There is a 
biodiversity convention, a multilateral treaty. 

Extinctions: Biologists warn about climate-induced mass extinctions, perhaps on the 
order of previous mass extinctions on the planet. Substantial climate-induced extinctions 
could happen relatively soon. Estimates are: 

(1) 15-37% of species extinction by 2050 under mid-range climate warming. 

(2) 10-25% of species could face extinction by 2050-2100 under BAU. 

(3) 10% species extinction by 2100 under BAU. Another estimate is that 50% of 
species will be lost due to climate change (no time scale specified). 


103 No Silver Bullet Solution to “Solve the Climate Problem”. Instead of thinking 
about a 100% solution, think about twenty 5% solutions. This is the counter argument to 
the complaint about any particular action proposal: “That won’t solve the climate 
problem”. Any given action only needs to solve part of the climate problem, although 
some can be more effective than others. A portfolio of climate actions is also desirable 
because some particular action may be obstructed or stopped by political or economic 
forces (App. IV). 


104 Fossil Fuel Development and Climate Risk Management: “Drill, Baby, Drill”
programs and development of tar sands (e.g. the Keystone XL pipeline from Athabasca, 
alternative/renewable energies, eventually including fusion energy and
advanced nuclear fission. The process of transition, including carbon budget
constraints, must be made consistent with energy demand and without


Canada) or of other fossil fuel sources are backwards steps for the mitigation of climate. 
Natural gas development with fracking is better than coal, but has problems (below). 

A better strategy for climate risk management would be the aggressive development 
of renewable energies rather than the “old dinosaur” mindset of using more fossil fuels.

Huge amounts of money are spent on fossil fuel exploration and development. UNEP 
says (ref) "spending on exploration and development of new fossil fuel reserves by the 
200 largest listed fossil fuel companies totalled US$674 billion in 2012, three times the 
global investment in clean technology". Such expenses could be a destabilizing factor for 
the fossil fuel industry. 


105 Alternative / Clean / Renewable Energies, Exponential Increase: Leading
technologies are: wind, solar (distributed, centralized), geothermal, fusion (tokamak or
other), 4th generation nuclear fission (fast neutron integral breeder reactors), advanced
biofuels (switchgrass, poplar trees, algae, hemp), ocean currents, other hydropower.
Smart power grids and better battery/storage technologies are needed. Combinations of
sources can be used to “keep the lights on” (KTLO), to cope with intermittency.

For climate risk management, these alternative energies must eventually transition to 
replace fossil fuels to become the main energy sources. See refs. for details. 

It is a mistake to look only at the static amount of renewable energies. Dynamically, 
renewables are increasing exponentially. A one percent effect only needs to double seven 
times before it reaches 100%.
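The doubling arithmetic is easy to verify: 1% doubled seven times is 128%, while six doublings give only 64%. A minimal sketch, using a hypothetical helper name:

```python
def doublings_needed(start_share, target_share=1.0):
    """Number of doublings for start_share to reach or exceed target_share."""
    count, share = 0, start_share
    while share < target_share:
        share *= 2.0
        count += 1
    return count

# A one percent share: 1% -> 2% -> 4% -> 8% -> 16% -> 32% -> 64% -> 128%.
print(doublings_needed(0.01))
```

How long this takes in calendar time depends on the doubling time, which the text does not specify.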


106 The CO2 budget, Dilemma of development (energy demands / energy parity) vs. 
climate impacts, Transitioning from fossil fuels to Renewables; China and India. 
Humanity is rapidly using up the physics-based CO2 budget that is defined by the
physically allowable fossil fuel consumption consistent with a given temperature increase
(e.g. 2°C). This requires that substantial recoverable carbon must stay in the ground
(stranded assets), with a transition from fossil fuels to renewables.

I have no illusions about the difficulties of changing the energy infrastructure -
technically, economically, politically, societally, and psychologically. 

The dilemma is to resolve limiting climate impacts with worldwide energy demands 
including energy needed for economic development of poor countries and the goal of per- 
capita average energy parity. Development cannot involve undiminished (or increased) 
use of fossil fuels without future potentially devastating climate change impacts. 

Most CO2 now in the atmosphere is from developed countries. This will change
under scenarios to most atmospheric CO2 being from currently underdeveloped countries
as they develop. This must occur if average energy parity is approached because most of 
the global population is in currently underdeveloped countries. Note that the definition of 
energy parity will also change if the U.S. increases per-capita fossil-fuel consumption. 

Use of the worst fossil fuel (coal) worldwide is increasing, especially in China and 
India. The recent (Nov. 2014) bilateral U.S.-China agreement has China's target to 
expand total energy consumption coming from zero-emission sources to around 20% by 
2030, which means 80% of China's energy will still be from fossil fuel sources in 2030. 
China proposes to peak fossil fuel use in 2030 (with intent to peak earlier), and then 
decrease fossil fuel use. India and the US concluded a climate and clean energy 
agreement (Jan. 2015) to increase the use of renewables. 

Renewables are rapidly expanding, notably in Germany, China, India, the U.S., 
Japan, and Africa. 


Chapter 53: Climate Change Risk Management 913 


compromising economic development. Fortunately, renewable energy costs are 
rapidly declining, as measured by the Levelized Cost of Energy (LCOE, ref). 

Climate mitigation has other aspects: (1) Energy efficiency improvements for 
buildings and industry. (2) Transportation paradigm shifts (mass transit 
enhancements, new vehicle technology). (3) Mass carbon sequestration. 
(4) Technology advances. (5) Financial: Climate-friendly, Green bonds; 
Carbon fee or tax. (6) Stopping deforestation, better land use. (7) Less meat. 
(8) Better waste management; “Reduce/Reuse/Recycle”. (9) Carbon neutral 
pledges by universities, businesses, etc. (10) Carbon offsets. 

Successful climate mitigation can become reality if all sectors become 
substantially more involved: (1) Individuals. (2) Corporations (supra). (3) 
NGOs (environmental, religious, educational). (4) Cities (over 50% of people 
live in cities worldwide). (5) State and regional governments. (6) National 
governments (financing, legislation, regulation). (7) International 
conferences producing a (FAB?) climate agreement. (8) The Arts. 


People in rural regions with no power grid find distributed solar attractive. There are 
over one billion people without access to electricity, mostly in Africa and Asia (ref). 


107 The Carbon Budget and Future Generations: Note that if the present generation 
uses up the carbon budget, no carbon will be left for future generations to use, given 
constraints from climate mitigation. 


108 Carbon Sequestration: A one-liner: to be “safe” (J. Hansen), CO2 concentration 
should be limited to 350 ppm. Today's level is about 400 ppm, so about 50 ppm needs to be removed. 
Forests provide the most visible sequestration; cf. the UN program REDD (ref). Carbon 
that could be extracted from the atmosphere and sequestered (green plants or special 
trees, storage caverns, biochar, advanced technology...) would reduce global warming. 
Some agricultural techniques naturally sequester carbon, such as holistically managed 
grasslands and other no-till practices. CCS (Carbon Capture and Storage) is being 
developed for coal facilities. 

Carbon sequestration (other than trees) has not been demonstrated on economically 
viable large scales with known, acceptable risks. One risk is increased earthquake activity 
from unstable faults. Substantial development is needed. 


109 New or Old Technology for Climate Action? I am a big fan of new technology. We 
need better technology for CCS, energy, batteries, etc. Doubtless there will be new ideas 
that will prove attractive. However I also believe that it is foolhardy to close our eyes and 
put blind unlimited faith in unknown technology advances. One issue with new 
technology is that lab-to-production takes many years (one example is fusion), and we 
don't have very long to act in a robust manner on climate to avoid unacceptable impacts. 


In fact, we can largely act on climate change risk using current technology. 


110 Story: Deforestation, red beetles, and warming climate: A Canadian attendee at the 
Copenhagen Conference told me about flying in a helicopter above forests that had turned 
red — due to mountain pine beetles that had killed the trees. Winters were no longer cold 
enough to kill the beetles. See ref. 


111 EPA Standards, Procedures, History: The U.S. Environmental Protection Agency 
(EPA) is required under law and a Supreme Court decision to regulate greenhouse gases 

Carbon Productivity 


Successful action on climate change must support stabilizing atmospheric GHGs 
and maintaining economic growth. For this, "carbon productivity", the amount 
of GDP produced per unit of CO2 emitted, must increase dramatically. 
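The definition of carbon productivity can be made concrete with a small calculation. This is a sketch under stated assumptions: the function names and the scenario (GDP doubling while emissions are halved) are hypothetical and chosen only to illustrate the ratio; they are not figures from this chapter.

```python
def carbon_productivity(gdp: float, co2_emitted: float) -> float:
    """GDP produced per unit of CO2 emitted; units follow the inputs."""
    if co2_emitted <= 0:
        raise ValueError("co2_emitted must be positive")
    return gdp / co2_emitted


def required_productivity_multiple(gdp_factor: float, emissions_factor: float) -> float:
    """Factor by which carbon productivity must rise when GDP is scaled by
    gdp_factor and emissions are scaled by emissions_factor."""
    return gdp_factor / emissions_factor


# Hypothetical scenario (illustrative numbers only): GDP doubles, emissions halve.
before = carbon_productivity(gdp=100.0, co2_emitted=50.0)
after = carbon_productivity(gdp=200.0, co2_emitted=25.0)
print(after / before, required_productivity_multiple(2.0, 0.5))
```

Because carbon productivity is a ratio, growth in GDP and cuts in emissions multiply: in this hypothetical case the ratio must rise fourfold.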


Programs of Climate Change Adaptation 
Some adaptation to climate change impacts will be quite necessary. The 
human race is no longer composed of tribes of hunter-gatherers, but billions of 
people generally living in highly arranged societies with limited mobility. The 


(e.g. by setting standards), given the EPA endangerment climate finding. That is, the EPA 
has no choice but to regulate. 

EPA Procedures: The EPA engages in lengthy public hearings, accepts public 
comment, and produces response documents. 

EPA Origin: The EPA was established by President R. Nixon in 1970. See refs. 


112 International Climate Agreement Issues: The road to a climate agreement is very 
rocky. Emissions reduction negotiations are now on a voluntary basis, by country, called 
INDCs (Intended Nationally Determined Contributions). Other issues include: 


(1) Cumulative vs. current emissions (China is now the #1 CO2 yearly emitter but the 
U.S. has cumulatively emitted more CO2 than China). 


(2) Energy equity (ethically speaking, why shouldn't villages in India have the same 
per-capita energy as in developed countries?). 

(3) Funding for technological assistance to poor countries that did little to create the 
climate problem (Green Climate Fund). 


(4) National (and local) interests generally. 

(5) Common but Differentiated Responsibilities (CBDR) and “respective capabilities”. 

(6) Climate damage compensation for poor countries. 

(7) Legal status of agreement (binding or not) and enforcement provisions. 

(8) MRV: Measurement, Reporting, and Verification. 

(9) Support for climate adaptation of developing countries. 

(10) NAMAs (Nationally Appropriate Mitigation Actions). 

International climate talks are held under the UN Framework Convention on Climate 
Change (UNFCCC), an international treaty with 195 Parties, including the United States. 

All countries agreed at the 2011 UNFCCC meeting to the “Durban Platform for 
Enhanced Action”, to write a climate agreement in Paris at COP21 in 2015. The language 
is to "develop a protocol, another legal instrument or an agreed outcome with legal force 
under the Convention applicable to all Parties". The target implementation is by 2020. 

It should be noted that for international climate agreement negotiations, the 
reductions in greenhouse gases are voluntary by country. Developing countries are not 
required to reduce greenhouse emissions. 

Climate negotiations also occur bilaterally (e.g. U.S. and China, ref). 


113 “FAB”: This means “Fair, Ambitious, Binding”. The term was used in the 2009 
Copenhagen Conference to describe a desirable international climate agreement. At this 
time, a climate treaty (as opposed to an agreement) is politically infeasible. 

worse the impacts, the more difficulty humanity will have adapting to this new 
world. 

The amount of adaptation depends on the amount of mitigation (“an ounce of 
prevention is worth a pound of cure”). Mitigation timing is critical. Mitigation 
sooner (rather than later) means less adaptation needed. 

An adaptation example is developing crops more resistant to climate impacts. 
Another is raising structures against future sea level rise. Some adaptation 
in a BAU scenario will be undesirable mass climate migration. 


Geo-Engineering or Climate Engineering 


“Geo-engineering” or “Climate Engineering” refers to various proposed 
actions generally unconnected with mitigation attempts to alleviate the root 
causes of the climate problem. I regard most geo-engineering proposals as not 
sensible - a last resort that our descendants may be forced to use if we do not act 
responsibly. They will not be happy about it. 


Popular Pressure for Climate Action 


Popular pressure advocating for climate action is becoming more vocal. The 
“People’s Climate March” in New York City in 2014 featuring over 300,000 
people, accompanied by worldwide demonstrations, was a notable example. 


Carbon Fee and Dividend, Cap and Trade, Other Market Mechanisms 
• Carbon Fee (Tax) 


114 Raising the Port of Rotterdam: At the 2009 Copenhagen Conference, I spoke with a 
climate scientist who was using regional climate models to help plan how much the port 
of Rotterdam should be raised to cope with rising sea levels due to climate change. This 
sort of analysis will become more common for adaptation and resiliency actions. 


115 Geo-Engineering: A number of dubious schemes have been proposed. I grew up in 
Los Angeles where it was smoggy - the sky was not blue. One idea (“solar radiation 
management”) is to pump chemicals into the atmosphere to reflect sunlight; the whole 
world would then have a sort of smog experience. This would not ameliorate ocean 
acidification and it could never stop - otherwise unmitigated global warming would 
resume as if the scheme had never been implemented. Other problems exist, including: 

(1) Distraction from solving the climate problem through mitigation. 

(2) Can't test a geo-engineering scheme at global scales over long times to see if it 
works without having major unintended consequences. 

(3) Undesirable regional variations. See refs. 


Sometimes carbon sequestration is listed as geoengineering, but I think this should 
actually be classified as mitigation because it reduces atmospheric CO2. 

A carbon fee or tax offers a fair market method to reduce CO2 
emissions. The fee compensates the public for the damage done by climate 
change caused by fossil fuels, and for which the public has to pay, through 
revenues. This should be a cost of business in the price of fossil fuels, but 
currently is not (although some countries do have some carbon tax). Carbon 
fees are a “Pigovian tax” for the “negative externality” of generating costs for the 
public due to climate impacts of burning carbon. The present economic situation 
with fossil fuels is a distortion of a free market, with uncompensated negative 
externalities. The fossil fuel industry today basically operates with “private 
profits and socialized risk”, with the public picking up current and future risk 
expense due to climate change impacts from fossil fuels, plus subsidies (below). 
Litterman and others analyzed the financial and risk aspects of a carbon tax, 
including possible catastrophic economic tail risk due to climate impacts (cf. refs). 


A. Carbon Fee + 100% Dividend; the Citizens’ Climate Lobby 


The Citizens’ Climate Lobby (CCL) is a non-partisan organization that advocates 
for a revenue-neutral “carbon fee + 100% dividend” program that prices carbon 
in fossil fuels. The fee is at extraction (coal, oil, natural gas), simplifying 
the collection. The fee would start at a low level and increase each year. The 
CCL proposal significantly cuts CO2 emissions over time and is positive for the 


116 Carbon Fee/Tax Units: A carbon fee or tax is in units of $US/ton of CO2 (not C). 
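Since the unit matters, a fee quoted per ton of carbon and a fee quoted per ton of CO2 differ by the ratio of molar masses (about 3.66). The following is a minimal sketch; the function names are hypothetical, and the atomic masses are standard rounded values rather than figures taken from this chapter.

```python
# Standard rounded molar masses in g/mol (assumed here, not from the text).
M_C = 12.011
M_O = 15.999
M_CO2 = M_C + 2 * M_O  # about 44.009


def fee_per_ton_co2_from_per_ton_c(fee_per_ton_c: float) -> float:
    """Convert a fee in $/ton of carbon to the equivalent $/ton of CO2.

    One ton of CO2 contains M_C / M_CO2 tons of carbon, so the fee per ton
    of CO2 is smaller by that fraction.
    """
    return fee_per_ton_c * M_C / M_CO2


def fee_per_ton_c_from_per_ton_co2(fee_per_ton_co2: float) -> float:
    """Convert a fee in $/ton of CO2 to the equivalent $/ton of carbon."""
    return fee_per_ton_co2 * M_CO2 / M_C


# Hypothetical fee level, for illustration only.
print(round(fee_per_ton_c_from_per_ton_co2(10.0), 2))
```

A fee stated per ton of C therefore looks roughly 3.7 times larger than the same fee stated per ton of CO2, which is why the unit should always be given.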


117 Efficiency of a Small Carbon Fee/Tax; Acceptability: If the carbon tax is smaller 
than the market price swings of a fossil fuel, the tax on that fuel will not be efficient. A 
small carbon tax, although moving the price up somewhat, could get lost in the market 
volatility, and then could have only marginal effects on the economy or on fuel demand. 
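The comparison between a small fee and ordinary price swings can be sketched numerically. Everything specific below is an assumption introduced for illustration: the emission factor of roughly 8.887 kg of CO2 per gallon of gasoline is an approximate, commonly cited value; the fee is taken per metric ton; and the $10 fee and $0.50 price swing are hypothetical inputs, not data from this chapter.

```python
# Approximate emission factor for gasoline (assumption; kg of CO2 per gallon burned).
KG_CO2_PER_GALLON = 8.887


def fee_per_gallon(fee_per_ton_co2: float) -> float:
    """Price increase per gallon implied by a fee in $ per metric ton of CO2."""
    return fee_per_ton_co2 * KG_CO2_PER_GALLON / 1000.0


def fee_to_swing_ratio(fee_per_ton_co2: float, typical_price_swing: float) -> float:
    """Size of the fee-induced price change relative to a typical market swing.

    A ratio well below 1 suggests the fee could get lost in market volatility.
    """
    return fee_per_gallon(fee_per_ton_co2) / typical_price_swing


# Hypothetical inputs: a $10/ton fee against a $0.50/gallon price swing.
print(round(fee_per_gallon(10.0), 3), round(fee_to_swing_ratio(10.0, 0.50), 2))
```

Under these assumed numbers the fee adds under nine cents per gallon, a fraction of the assumed swing, which is the situation the paragraph above describes.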

A low carbon tax may however be acceptable from a political standpoint and 
acceptable by the fossil fuel industry for lowering possible legal and reputational risk. 

A carbon fee/tax will be passed on to the consumer through higher energy prices. 
However fee/tax revenues can be used to offset these effects, discussed below. 


118 Carbon Fee/Tax - Revenue Possible Uses: The carbon fee/tax revenues could be used in 
different ways or in combination: 

(1) Return the money back to households (Citizens’ Climate Lobby proposal). 

(2) Tax swap (lowering other taxes). 

(3) Increase government support for renewable energies. 

(4) Lower the national debt. 


119 More on Carbon Fee and 100% Dividend: A border adjustment would compensate 
for imported goods from countries that do not implement a similar carbon fee program. It 
is possible that this would provide incentives for other countries to institute their own 
carbon fee programs. Because the program benefits the U.S., there is no economic reason 
for the U.S. to wait for action from other countries. Theoretically there is little 
administrative cost and no new government department is needed. The proposal is 
designed to be non-partisan. See above for the REMI analysis of the CCL proposal. 

economy (above). Recently the CCL created the “Pathway to Paris” organization 
lobbying for a price on carbon in international climate negotiations. 


• Fee and Dividend vs. CAFE: A Poll 


The Initiative on Global Markets (IGM) took a poll of their Economic Experts 
Panel on a carbon tax. The statement posed was: "A tax on the carbon content of 
fuels would be a less expensive way to reduce carbon-dioxide emissions than 
would a collection of policies such as "corporate average fuel economy" 
requirements for automobiles" (CAFE). The result was 90% agreement. 


• Cap and Trade / Emissions Trading 


Cap and Trade (or Emissions Trading) puts a price on carbon. A central 
authority sets a limit or cap on the amount of CO2 that can be emitted. Firms buy 
or sell (trade) emissions permits according to their needs. The buyer pays a 
charge for polluting, while the seller is rewarded for having reduced emissions. 
The largest program is the E.U. Emissions Trading Scheme. The RGGI 
(Regional Greenhouse Gas Initiative) is a cap-and-trade program composed of 
participating U.S. states. Other carbon trading platforms exist. See refs. 


• The Clean Development Mechanism (CDM) 


The CDM allows emission-reduction (or removal) projects in developing 
countries to earn certified emission reduction (CER) credits to satisfy Kyoto 
protocol requirements. CERs can be traded and sold. 


• Renewable Portfolio Standards (RPS) 


A Renewable Portfolio Standard (RPS) is a regulation that requires the increased 
production of energy from renewable energy sources. Some U.S. states have 
an RPS, but there is no U.S. national RPS. See the DSIRE website. 


• Utilities and Renewable Energy, Feed-in Tariffs, Smart Power Grids 


Utilities that manage the extensive power grids fed by fossil fuels are becoming 
threatened, and some are transforming. This is due to distributed renewable 
sources (e.g. solar) not needing grid power. Renewable sources are paid for 
power put into the grid by the utilities. Feed-in tariffs are long-term contracts. 
Some utilities are investing in renewable energies. Others oppose renewable 
energies or advocate for extra fees. The utilities’ dilemma is retaining established 
but increasingly expensive fossil fuels vs. increasingly cheaper renewable energy 
sources and smart climate policies. Utilities are beginning to employ energy 
conserving “smart power grid” technology. See refs. 


• Environmental Derivatives 


120 DSIRE: DSIRE is the comprehensive source on incentives and policies that support 
renewables and energy efficiency in the US, funded by the U.S. Department of Energy. 

These include futures (e.g. on RGGI Allowances) and futures options, renewable 
energy certificates, sustainable equity indices, insurance on catastrophic weather 
events, etc. See the Chicago Climate Futures Exchange. 


Other Energy Considerations: Natural Gas, Fossil-Fuel Subsidies 
• Natural Gas, Fracking, Coal Displacement in the U.S. 


Natural gas (NG) is rapidly displacing coal in the U.S. because NG is cheaper 
than coal per unit of energy. Moreover the "Beyond Coal" movement has been 
effective in opposing new coal power plants. The huge increase in NG is due 
to the technical development “fracking”, an advanced process that fractures rocks 
to release NG. Burning NG under controlled conditions releases less CO2 than 
coal for the same energy, positive for mitigating climate change. However there 
are substantial NG issues. 


• Fossil Fuel Subsidies and Climate Mitigation 


The price of fossil fuels will have to increase substantially to reduce fossil fuel 
demand and to mitigate climate change successfully. Fossil fuel prices do not 
reflect true costs due to large direct and indirect fossil fuel subsidies. 


121 Natural Gas (NG), Fracking, and Climate Change Issues: Not so simple. Four 
main problems exist: 

(1) Methane release compromises the advantage of NG over coal for climate; 
methane is a powerful greenhouse gas. This risk can be reduced through careful controls. 

(2) Investment in NG can impede large-scale development of renewable energy 
because NG can use up available investment capital for energy demands. If so, NG will 
not be a “bridge” to renewable energy, and long-term climate mitigation will be impeded. 

(3) Fracking destroys rock formations, making these formations unusable as future 
potential carbon sequestration sites that may be needed for climate mitigation. 

(4) Water issues: (4a) Large water requirements for fracking. (4b) Ground water 
contamination (methane-contaminated faucet water has been filmed being set on fire). 
(4c) Fracking needs chemicals that can contaminate water supplies. 


122 Fossil Fuel Subsidies: Fossil fuels enjoy huge socialized subsidies. In the U.S., 
government subsidies have totaled half a trillion dollars since 1950. Worldwide, 
fossil fuel subsidies are about $550 billion per year (about half a trillion US dollars per year). 
The fossil fuel industry also receives indirect public subsidies that are not reported: 

(1) Public subsidy coping with climate damage caused by fossil fuels. 

(2) Public subsidy of the transportation infrastructure (roads, parking), essential to 
the oil industry. 

(3) Public subsidy of military expenditures to attempt to secure oil supplies overseas. 

Appendix IV. Risk from Climate Contrarian Obstruction 


“Closing your eyes is not a strategy!” 

Climate change “contrarians” and contrarian media are a major obstruction to 
climate change risk management. Because humanity has a limited time before 
climate impacts will become substantially worse and more costly, contrarian 
obstruction to climate risk management constitutes substantial risk. 


Who are the Contrarians? What is their Agenda? What about Denial? 


Climate "contrarians" are a loud vocal minority. The references have 
information on who contrarians are and what they say, including analysis and 
commentary. Targeted, unreliable contrarian material of poor quality is 
widespread, amounting to disinformation against climate science and risk 


123 Wise Saying: Sign carried at the People's Climate March in NYC 9/21/14. 


124 Discussing Contrarian Risk. Contrarian obstruction of climate action is the elephant 
in the room. I think I have a responsibility to discuss this difficult issue. Authors of other 
reports are quite aware of contrarian obstruction, even if their reports do not discuss it. In 
my opinion, risk management should deal with reality, including opposition to dealing 
with risk. One downside is giving contrarian arguments some publicity, but contrarians 
already have an army of promoters. A second downside is that discussing fallacious 
contrarian arguments is a distraction from discussing responsible climate action. 


125 Skeptics or “Faux-Skeptics”? Climate-change contrarians (or climate contrarians for 
short) are often called “skeptics”, but I think this is inaccurate. Scientists are generally 
skeptical until they are convinced by the evidence; at which point scientists are skeptical 
of contrarian counter-arguments. Climate contrarians are not real skeptics — they believe 
their claims and generally are unwilling to change regardless of evidence. I think they 
could be called “faux-skeptics”. 


126 “Denial” and “Deniers”: I take “climate denial” to mean generally systematic 
rejection of the enormous body of mainstream research summarized in the IPCC reports 
(supra) for climate science (WG1), climate impacts/vulnerability/adaptation (WG2), and 
climate mitigation (WG3). The appellation “deniers” (as in “climate science deniers” or 
“climate risk deniers” or “climate mitigation deniers”) is often used. 


127 Contrarian Minority: Cf. Yale study “Global Warming Six Americas", ref (supra). 


128 Contrarian Media, Climate Disinformation: Climate disinformation, as detailed in 
the references, consists of loud, ill-founded critiques of mainstream climate science, 
climate impact studies, and climate risk management action. Influential and widespread 
contrarian media (newspapers, magazines, TV, blogs, talk radio), right-wing politicians, 
commentators, and some religious fundamentalists, propagate climate disinformation. 

management by climate contrarians, contrarian media, some think tanks, 
and others. A major tactic is influencing public opinion psychologically (dislike 
of a solution decreases acceptance of evidence). This parallels former tobacco 
industry tactics opposing tobacco legislation. 

The underlying media/political contrarian motivation is a small-government 
libertarian/fossil-fuel/political agenda aiming at opposition to government risk 
management action on climate. 

The contrarian strategy cherry-picks capriciously optimistic climate scenarios 
and ignores risk management principles as described in this book. 

There are, however, conservative leaders who are positive toward climate risk 
management, and some argue against contrarians. Other conservatives could 
support climate risk management in a different environment. 

I think climate impacts will eventually become so evident that climate 
contrarians will lose influence. The message “we are all in this together" can 
prevail, essentially the Prisoner's Dilemma game-theory co-operative 
solution. Conservatives supporting climate risk management will become 
the rule. Those now “undecided” will act. Hopefully it won't be too late. 


Mainstream media also carry some climate disinformation. 


Sophisticated contrarian media “stay on message” in orchestrated PR “echo chamber” 
mode. Contrarian comments and ratings on books and websites are sometimes organized. 

Contrarian media tactics aim to influence the public not to act on climate. Studies 
show that contrarian media are effective for viewers who trust those media. Contrarian 
media constitute a virtual bubble on climate change disconnected from reality. 


129 Contrarian Think Tanks: Some right-wing think tanks have explicit contrarian goals 
to counter mainstream climate science. They hire or support scientists and non-scientists 
who have contrarian viewpoints, generate contrarian climate reports, hold contrarian 
climate conferences, and try to push contrarian “education” material into schools. These 
think tanks are trying to take over climate science for political ends (e.g. “small 
government"). They receive funding from right-wing sources (see below). 

Contrarian think tanks should be contrasted with real scientific departments in 
universities or laboratories, and with real scientific societies. I am not aware of any 
scientific society that supports climate contrarian positions against mainstream science. 


130 The Tobacco Industry and Climate Contrarians: Tobacco industry tactics and 
climate contrarian tactics are similar. For example, in tobacco industry words, "A demand 
for scientific proof is always a formula for inaction and delay" (Bates, ref). The same 
contrarian tactic sows doubt by attacking climate science as not being “certain”. Some of 
the very same people that denied tobacco risk are involved in climate risk denial (refs). 


131 Price on Denial? Al Gore suggests putting a price on climate denial in politics (ref). 


132 Prisoner’s Dilemma and Climate Action: Climate action is like game theory with 
co-operation through mitigation giving better outcomes than BAU following self-interest. 
Rational players will co-operate. This does not mean that real people co-operate. 
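The game-theory comparison can be illustrated with a toy two-player payoff table. This is a sketch only: the payoff numbers, the action labels, and the helper names are hypothetical, chosen to have the standard Prisoner's Dilemma ordering, and are not taken from the text or its references.

```python
MITIGATE, BAU = "mitigate", "BAU"

# PAYOFF[(a, b)] = (payoff to player A, payoff to player B).
# Hypothetical numbers with the usual Prisoner's Dilemma ordering.
PAYOFF = {
    (MITIGATE, MITIGATE): (3, 3),
    (MITIGATE, BAU): (0, 5),
    (BAU, MITIGATE): (5, 0),
    (BAU, BAU): (1, 1),
}


def best_reply_for_a(b_action: str) -> str:
    """Action maximizing A's own payoff in a single round, given B's action."""
    return max((MITIGATE, BAU), key=lambda a: PAYOFF[(a, b_action)][0])


def joint_payoff(a_action: str, b_action: str) -> int:
    """Sum of both players' payoffs for a pair of actions."""
    return sum(PAYOFF[(a_action, b_action)])


# Mutual mitigation gives each player more than mutual BAU does.
print(PAYOFF[(MITIGATE, MITIGATE)], PAYOFF[(BAU, BAU)])
# In a single round, BAU is the self-interested reply whatever the other does.
print(best_reply_for_a(MITIGATE), best_reply_for_a(BAU))
```

With these assumed payoffs, co-operation through mitigation leaves both players better off than BAU on both sides, even though BAU is each player's self-interested reply in a single round.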


133 What about Climate Science Disinformation in Mainstream Media? Mainstream 
media also have some climate disinformation, e.g. well-meaning but distorted analyses by 

Where do Climate Contrarian Organizations get Funding? 
Reports (see refs) assert that huge amounts of funding from right-wing sources 
go to organizations that publish climate contrarian material, perhaps up to $1 
billion per year. Much such funding is hidden using intermediate organizations. 


The Contrarian Attack on Climate Science 


A deliberate tactical decision was made by contrarians to attack mainstream 
climate science, in order to replace discussing climate risk management. Science 
is thus being used by contrarians as a tactic to distract from action on climate 
risk. The Luntz memo was a turning point. 

So we need to discuss science and science distortion in order to shed light on 
the contrarian misuse of science. Any field of science has its mavericks arguing 


some writers without much knowledge of climate issues. Science writers with some 
technical background can report climate issues from mainstream sources relatively 
accurately, and there are excellent science reporters. However the mainstream press has 
been firing science reporters, due to budget problems. 

Some mainstream media, in misguided attempts to "present the controversy", give 
contrarians unwarranted coverage. For almost all primary-source climate scientists there 
is no controversy over fundamental climate science, so “media balance" is a distortion. 


134 Contrarian Funding: "I call it the climate-change counter movement. It is not just a 
couple of rogue individuals... This is a large-scale political effort" (R. Brulle, ref). 


135 The Luntz Memo; the “Global Climate Science Communications Action Plan”; 
“Climate Change vs. Global Warming”; the Politicization of Climate Science: A 2002 
memo by the Republican strategist F. Luntz was influential in the contrarian political 
tactic to attack climate science. The memo said: “you need to continue to make the lack of 
scientific certainty a primary issue in the debate... ". 

Luntz was apparently also behind the idea to change the framework from "global 
warming" to “climate change", which psychologically sounds less severe (ref). 

The public is generally uninterested in science (as opposed to technology) and has 
little understanding of science. This made the Luntz strategy more effective. 

In 1998 the American Petroleum Institute (API) and others formed a “Global Climate 
Science Communications Plan" with a “National Media Relations Program". The stated 
goal was to "undercut "conventional wisdom" on climate science" (ref). 

Politics should have no place in science. However some politicians, the fossil fuel 
industry, and contrarian media deliberately politicized climate science and climate risk 
management, as shown by the Luntz memo and the API media relations program. 

The U.S. Republican Party leadership is currently in full denial of mainstream 
climate science and is vigorously obstructing climate risk management. It is significant 
that 49 out of 53 Republican Senators voted against the Schatz Amendment #58 (2015), 
which read: "...It is the sense of Congress that— (1) climate change is real; and (2) 
human activity significantly contributes to climate change." However not all Republicans 
think alike, and there is a difference in what some say privately vs. publicly. See ref. 

Some coal state Democrats, fearful for their constituents' jobs as perceived, and their 
own, are complicit in obstruction. 


Pointing all this out simply acknowledges reality. 

against the prevailing scientific paradigms. The vigorous scientific process and 
data weed out untenable arguments, which eventually just die away. Contrarian 
media promote untenable contrarian claims outside normal scientific process. 
The contrarians include some scientists, but only a few contrarians have 
appropriate climate credentials to be a primary expert source of information on 
climate science. Some contrarians do not have an overt agenda but rather 
have an incomplete understanding of the relevant science. 


136 Scientifically Discarded Ideas and Persistence of Incorrect Contrarian Claims: 
While the history of science is littered with forgotten scientifically discarded ideas, 
manifestly incorrect contrarian claims continue through support by contrarian media. 


137 Do Contrarians have Primary Climate Science Credentials? A few (and ONLY a 
few) contrarians have credentials to be primary source climate science experts, meaning 

(1) Holding a science PhD and a staff position related to climate at a university or 
laboratory, and 

(2) Having research published recently in respected peer-reviewed climate journals. 

It is generally not enough just to have a science degree (even a PhD, even a 
distinguished research career) to be a “climate science expert". Here is an analogy: a 
medical technician (even a professor of dentistry) does not generally have the knowledge 
to be a "cancer expert", let alone criticize oncology research. Those without in-the-field 
science credentials are unqualified to criticize mainstream science, notably contrarians 
who do not have climate science credentials as described above. 


138 Homework: Check some of the mainstream science references at the end of this 
chapter and note the expertise of the authors (credentials, affiliations — see footnote 
above). Start with the IPCC reports. Contrast with authors of contrarian reports. 


139 Contrarian Media Misunderstanding of Mathematics vs. Science; Science cannot 
be “Proved”, Legalistic Nitpicking, Uncertainty and Certainty, “Theory”: Science is 
what scientists do, not what outsiders think science “should be". Amateurs do not get a 
vote for what should constitute the "scientific method". Contrarian media and some 
contrarians routinely misunderstand and misrepresent the very nature of science. 

Science consists of a web of facts and theory. Evidence from independent sources 
makes the science stronger. Science uses mathematics as a tool, but science is essentially 
different from mathematics. Whereas one incorrect detail invalidates a mathematical 
proof, a single strand in the web of scientific evidence is often of minor importance. 

A main focus of contrarian disinformation is to “instill doubt" with the pseudo-legal 
fallacy that climate science has to “prove” something, as in a court of law. There is no 
proof in science. Scientific data and theories are established to a greater or lesser degree, 
but science is never “proved”. Science always has some uncertainty, which contrarians 
chronically misrepresent. We know some things with relative certainty and others with 
less certainty. Establishing the level of certainty is a large part of doing science. 

A scientific idea can be falsified if the idea unambiguously predicts a phenomenon 
that is unambiguously contrary to real data, as determined by scientific experts in the 
field. Science in the real world is more nuanced than this idealized process. In particular, 
science is not “disproved” by nitpicking an incorrect or un-established detail. Also, the 
mere existence of a contrarian argument is not a “counterexample” in science as a 
counterexample could be in mathematics. 

"Theory" is a word with scientific meaning. To a scientist, a "theory" is an 
established, tested and substantiated framework with concepts and equations, consistent 
with available data. I used to teach Maxwell's Theory of Electromagnetism and Einstein’s
Theory of Relativity to physics graduate students. The mistaken popular idea that
"theory" means some sort of “guess” is wildly incorrect and gives a false impression.

Contrarians and contrarian media regularly misuse the words “Proof”, “Uncertainty”,
and “Theory”, exploiting them as tactical weapons against mainstream climate science.


The main problem contrarians have is that the greenhouse gas effect has been
established basic physics for over 100 years. Contrarian claims cannot explain
away physics; the usual ploy is to distort its consequences (cf. fallacies below).

Some contrarians, media, and politicians have directly attacked some climate
scientists and science itself¹⁴³, ¹⁴⁴; scientists have responded¹⁴⁵.


140 Is Climate Science “Settled”? Answer: climate science is “Settled Enough”. The
science on the basic elements of climate change is indeed settled (existence of the 
greenhouse gas effect and the anthropogenic cause of recent global warming). Scientists 
argue about detailed points all the time. Contrarians exaggerate details and misleadingly 
claim that the science is not "settled". However, a large part of climate science is so well 
established that for practical purposes, climate science is NOT uncertain and IS settled. 


141 “Extraordinary Claims Require Extraordinary Evidence” vs. Contrarians: The
excellent statement is from Carl Sagan. An extraordinary contrarian claim is that the
primary climate science experts are wrong. For that extraordinary claim, contrarians have
no viable evidence, let alone extraordinary evidence.

In the past, an extraordinary claim was that humans can influence the climate. Solid
evidence for this claim convinced the community of naturally skeptical scientists who are
now mainstream climate scientists. Contrarian media misrepresent or ignore this history.


142 Two Contrarian Lines and Rebuttals: One contrarian line is “climate science is just
theory and full of uncertainty, so we shouldn't act on climate”. The rebuttal is “climate
science is solid, uncertainties are exhibited, and not acting on climate is irresponsible”.
Another contrarian line is: “I have an idea on climate change and you can't prove it's
wrong, so we shouldn't act on climate”. The rebuttal is “Your idea is not backed by
evidence, ‘provability’ is a red herring, and not acting on climate is irresponsible”.


143 Contrarian Attacks on Scientists and Science: A common contrarian defense
consists of going on the offensive against mainstream science, including attacking the
scientific process itself. Contrarians falsely accuse scientists of doing the very things
contrarians themselves do, e.g. cherry picking data and constructing flimsy arguments.
Contrarian media and some politicians engage in ad-hominem attacks, with high levels of
ignorance and belligerence. They accuse climate scientists of having mercenary motives.
Some contrarian organizations, individuals, and politicians file lawsuits or “Freedom of
Information” requests that harass scientists and waste their time. Some climate scientists
have received vicious hate mail and even personal threats (cf. M. Mann's book, ref).


144 Climate Scientist Fear: More than one climate scientist told me that some climate
scientists are afraid to speak out, due to fear of such attacks.


145 Scientists Respond to Contrarian Attacks: An extraordinary letter published in the 
journal Science (ref), signed by 255 members of the U.S. National Academy of Sciences, 
including 11 Nobel Prize laureates, read: "Many recent assaults on climate science and,
more disturbingly, on climate scientists by climate change deniers are typically driven by 
special interests or dogma, not by an honest effort to provide an alternative theory that 
credibly satisfies the evidence." 
The Categories of Climate Contrarian Fallacies in Science and Risk


Different contrarians claim and promote different arguments, some mutually
inconsistent. Some contrarian material is fringe science¹⁴⁶. Other claims
resemble “Cargo-Cult Science” or pseudoscience with some scientific form but
without scientific content. Various fallacies often exist in contrarian material.

Non-experts cannot tell the difference¹⁴⁷, ¹⁴⁸ between contrarian arguments and
real science. Mainstream scientists are quite aware of contrarian claims¹⁴⁹.

Consider the example of global warming, the globally averaged upward 
temperature trend from anthropogenic climate change, observed since around 
1975. There are four basic fallacies of contrarian denial: 


1. Denial that Global Warming Exists 


The global warming trend has existed since 1975, and global warming will 
continue (cf. App. I). Some contrarians deny that global warming exists, cherry 
picking data or confusing weather with climate. Contrarians misused erroneously 


146 “Fringe” Science: Fringe science generally contains a speculative, relatively untested
hypothesis or idea, departing from established science (ref). Fringe ideas can be generated 
by anybody. They can be legitimate and useful, or wrong and not useful. Occasionally a 
fringe idea that challenges a main pillar of established science is right — sometimes 
spectacularly so — but most fringe ideas turn out to be wrong. This is because established 
science is very difficult to achieve, including consistency with data and consistency with 
previously established scientific theory. An example of fringe climate science is the 
unestablished claim that cosmic rays are a main cause of climate change (ref). 

Fringe ideas should be distinguished from attempts to extend science on the 
boundaries of established science. Retaining consistency with established science and 
formulating ideas to extend it is a part of normal science discourse. 

What about disinformation? Fringe science by itself is not disinformation. The 
broadcasting of fringe science and discredited claims by contrarian media in order to 
attack mainstream science — which occurs regularly — does constitute disinformation. 


147 Hard for non-experts to tell the difference between contrarian claims and science:
Contrarian disinformation is carefully crafted to sound plausible to laypeople who are
unlikely to have the time, inclination, or ability to study and understand climate issues.
Expert analysis is often required to see fallacies in contrarian material.

It is difficult to write rebuttals to contrarian material. It is noteworthy that rebuttals
usually get less publicity than the original contrarian disinformation.


148 The Debate Fallacy: Scientific progress takes place in universities and laboratories with
discussions, seminars, and peer-reviewed publications, and is not judged in public debate. 
“Debate” is a favorite contrarian tactic used to confuse non-expert audiences and get 
people to think there is no consensus — merely by having the debate in the first place. 


149 Consideration of Contrarian Material: I have spent a lot of time going through
contrarian material. In a perfect world, I would be happy if contrarians were right and
there were no problems with climate change. No such luck: contrarians are just wrong.
analyzed satellite data¹⁵⁰. The BEST study was a notable exception¹⁵¹. A
contrarian fallacy is the false implied long-term consequence of a “hiatus” or
slowdown in surface global warming¹⁵².


150 Faulty UAH Analysis of Satellite Data, Contrarian Attacks: Errors were made in
Univ. Alabama Huntsville (UAH) analysis of satellite data (MSU 2LT), including an 
important minus sign error. These incorrectly analyzed satellite data seemed not to show 
global warming, which contrarians used to attack mainstream climate science. After the 
errors were (finally) corrected, satellite data exhibited better, though not complete, 
agreement with surface measurements (refs). In the end, mainstream science was right 
and the contrarians were wrong. See R. Pierrehumbert's video (ref) and footnotes below. 
One fundamental satellite data problem is that satellite data do not actually measure 
surface temperatures where people live, but rather integrated temperatures up to heights 
that are far above the earth's surface. Complex procedures are used, with uncertainties. 
Satellite temperature data have higher uncertainty than data from other sources. The 
last 30 years of satellite temperature data (since 1975) have 2σ error ±0.06 °C/decade;
the 2σ error for other sources is ±0.03 to ±0.04 °C/decade (via K. Cowtan website, ref).
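
A minimal sketch of what a quoted 2σ trend error means, using ordinary least squares on a synthetic annual series. The data, the noise level, and the function name trend_with_2sigma are assumptions made for illustration only; published uncertainties such as those quoted above also correct for autocorrelation, which this sketch ignores.

```python
import numpy as np

def trend_with_2sigma(years, temps):
    """Ordinary least-squares trend (deg C/decade) and its 2-sigma error.

    Assumes independent residuals; real analyses also treat autocorrelation.
    """
    x = np.asarray(years, dtype=float)
    y = np.asarray(temps, dtype=float)
    n = x.size
    xc = x - x.mean()
    slope = np.sum(xc * (y - y.mean())) / np.sum(xc ** 2)
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    # Standard error of the slope, with n - 2 degrees of freedom
    se = np.sqrt(np.sum(resid ** 2) / (n - 2) / np.sum(xc ** 2))
    # Convert from per-year to per-decade units
    return 10.0 * slope, 10.0 * 2.0 * se

# Hypothetical data: 0.17 deg C/decade trend plus random year-to-year noise
rng = np.random.default_rng(0)
years = np.arange(1985, 2015)
temps = 0.017 * (years - years[0]) + rng.normal(0.0, 0.1, years.size)

trend, err2 = trend_with_2sigma(years, temps)
print(f"trend = {trend:.2f} +/- {err2:.2f} deg C/decade (2 sigma)")
```

A shorter or noisier record gives a larger 2σ error, which is one reason short satellite sub-periods are a poor basis for claims about the long-term trend.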


151 The BEST (Berkeley Earth Surface Temperature) Study: BEST was partially
staffed by contrarians and it was partially funded by contrarian sources. BEST concluded
global warming exists and that humans are causing it. One prominent BEST researcher
announced he changed his mind from being skeptical about the existence of global
warming to saying vigorously that “global warming is real” (ref).


152 Temporary Global Warming “Hiatus” or Surface Slowdown of Global Warming
(SSGW), a main contrarian talking point, and why it doesn’t matter: Planetary global
warming has not stopped. Global warming still exists. The “hiatus” is a misnomer for a
slowdown, but not cessation, of global warming at the surface (SSGW). We live on the
surface, so surface global warming is the important issue. Mainstream science asserts that
SSGW will not continue and is not important for climate mitigation planning.

Less heat is escaping from the earth than heat coming to the earth, so the earth 
(atmosphere and ocean) continues to warm. SSGW arises from a La Niña state due to 
natural variation, notably involving oceans. Most heat from the sun goes into the oceans. 
Fluctuations of the fraction of heat going into the oceans have big effects on the heat 
retained in the atmosphere. Extra energy has recently transferred into the Pacific Ocean 
(at depths 700m - 2000m) from the atmosphere. The Atlantic and Indian Oceans are also 
involved. The cycle of more or less heat transfer into the ocean (because it is a cycle) will 
reverse, and the reversal time is expected long before 2100. Extra heat will come back out 
of the oceans into the atmosphere, restoring the faster surface warming trend. 

Other short-term natural effects contributing to the SSGW are recent small volcanic 
eruptions with a cooling effect, and an anomalous recent solar cycle. 

The SSGW is exaggerated by contrarians who cherry-pick the starting point as the 
unrepresentative year 1998 with the gigantic El Niño that brought huge amounts of heat 
out of the ocean to the surface, anomalously raising the 1998 surface temperature. 
Contrarians moreover cherry-pick the satellite data that have the biggest 1998 outlier. These
biases underlie the fallacious contrarian “global warming has stopped” claim. Other
choices of starting times and other data sources invalidate this spurious contrarian claim.
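
The effect of cherry-picking the starting year can be sketched with a synthetic series: a steady underlying trend plus one anomalously warm year standing in for 1998. All numbers here are hypothetical, chosen only to illustrate the fallacy, and are not taken from any temperature data set.

```python
import numpy as np

def trend_per_decade(years, temps, start_year):
    """Least-squares trend (deg C/decade) using only data from start_year on."""
    mask = years >= start_year
    slope_per_year = np.polyfit(years[mask], temps[mask], 1)[0]
    return 10.0 * slope_per_year

years = np.arange(1975, 2014)
temps = 0.017 * (years - 1975)      # hypothetical steady warming, 0.17 deg C/decade
temps[years == 1998] += 0.25        # hypothetical El Nino-like spike in 1998

# Compare the full record with windows starting on, and just after, the spike
for start in (1975, 1998, 1999):
    print(start, round(trend_per_decade(years, temps, start), 3))
```

In this toy series the underlying trend never changes, yet the window that starts on the spike year reports a visibly smaller trend than the window starting one year later.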


Climate models have individual runs, or paths of temperatures, that resemble the 
hiatus (Meehl, ref). The existence (though not timing) of slowdown periods is consistent 
with models. Contrarian attacks on models related to these points are highly misleading. 


See references, Appendix I. 
2. Denial that Humans Are Causing Global Warming


Humans are now causing global warming (cf. App. I). Some contrarians admit
that recent global warming exists but claim it is not caused by humans. 


3. Denial that Global Warming has Harmful Impacts 


Myriad technical reports worldwide from universities and laboratories describe 
the increasingly serious consequences of global warming (cf. App. II). Some 
contrarians, while agreeing that global warming exists and is caused by humans, 
deny or ignore harmful impacts and illogically ignore the risk of uncertainty. 


4. Denial that We Can Mitigate Global Warming 


Given the serious problem of global warming, mitigation is imperative and 
possible, with positive opportunities (cf. App. III). Some contrarians admit that 
global warming exists, is caused by humans, and has harmful effects, but assert it 
is too expensive to mitigate, deny positive opportunities, deny responsibility, 
and/or finger other countries to carry the burden. 


Climate Contrarian Fallacy Lists 


• Contrarian scientific fallacies and pseudoscience ¹⁵³, ¹⁵⁴, clvii, clix


• Contrarian science-logic fallacies ¹⁵⁵, ¹⁵⁶


153 Personal experiences with contrarians; my “One-Liner Responses to Contrarian
Fallacies”: I have learned about climate disinformation and contrarian arguments
first-hand through extensive interactions with climate science / climate risk contrarians.

Typical "conversations" consist of me reporting reliable climate analyses, and then 
responding to contrarians who ignore these analyses and: (1) parrot talking points from 
contrarian sources, or (2) concoct non-sequitur statements with incorrect assumptions. 
They don't listen. Some contrarians do not seem to agree with any mainstream climate 
science or risk management. The drumbeat of disinformation and ignorance is loud. 

Separately, I once participated in an experiment of “counter the contrarians" with 
role playing to test which methods of communication were effective. 

These experiences led me to write the “One-Liner Responses to Climate Contrarian
Fallacies”, since expanded, posted on SkepticalScience.com under “Global Warming
and Climate Change Myths” (ref). Have a look.


154 Contrarian Pseudoscience: The pseudoscientific contrarian “grasping at straws” list
is long (176 fallacies as of November 2014 on the SkepticalScience “Myths” list). I'm not
going to repeat the list. Details are on SkepticalScience.com and RealClimate.org (supra).


155 Contrarian Science/Logic Fallacy List: In addition to distorting the nature of science,
the long list of fallacies from which some contrarians and contrarian media pick includes:


(1) Ignoring, misunderstanding, misquoting, misrepresenting, and/or distorting 
mainstream science - theory and data. 


(2) Oversimplifying, omitting caveats, taking exaggerated limits.


(3) The “No Consensus” Fallacies (Re: the demonstrated consensus on the existence
and main anthropogenic cause of recent global warming of around 97% of primary source
scientists). Fallacy Variations: (3a) Claiming no scientific consensus exists, so no action
is “required”, (3b) Stating scientific consensus does exist but is irrelevant (ignores the
evidence that led to consensus), (3c) Misrepresenting ordinary scientific debate as “no
consensus”, (3d) Quoting people without primary climate science expertise — or without
any scientific expertise, (3e) Arguing there is no consensus over something else,
(3f) Nit-picking the consensus percentage (e.g. 90% vs. 97%).


Chapter 53: Climate Change Risk Management 927


• Contrarian fallacies attacking the hockey stick ¹⁵⁷


• Contrarian false dilemma - poor people and climate mitigation ¹⁵⁸

(4) Cherry-picking and false generalizing: (4a) Isolated data points, (4b) Minor 
points in climate science, (4c) Isolated climate impacts with difficult attribution. 

(5) Sophistry: (5a) Irrelevant red-herring diversions, (5b) Fabricating non-sequitur
if-then statements having incorrect assumptions and false conclusions, (5c) False analogies,
(5d) Quoting out of context, (5e) Euphemisms (e.g. “questioning science” really means
“rejecting established science”).

(6) Demanding unachievable and unneeded "precision" with the attack: “if you don’t 
understand this detail you don’t understand anything”. 

(7) Overemphasizing uncertainties. 

(8) Paranoid conspiracy fallacies. 

(9) “Contrarians are like Galileo” (an absurd, historically wildly inaccurate analogy). 

(10) Distorting scientists being convinced by the evidence into claiming scientists 
have “dogmatic belief". See Skepticalscience.com and RealClimate.org. 


Example - “Snowmageddon vs. global warming”: “Look outside the window at the
snow — does that look like global warming to you?” One fallacy here (there are several) is
false generalization from small to global spatial scales. This is the same fallacy that
would be behind absurdly concluding “The Earth Is Flat” by looking out the window.


156 The “Repeatable Experiment” Fallacy: We are currently performing an
unrepeatable and dangerous experiment on Earth. We cannot rewind the clock and redo
the “experiment”. Contrarians use this elementary fact to attack climate science. However,
there are some repeatable “experiments” — e.g. using different techniques to measure
temperature (land based stations, satellites). Models provide a framework to do “what-if”
experiments with different conditions. Attribution studies where climate models are run
“with” and “without” human CO2 emissions constitute a good example.
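
A “with” and “without” comparison can be sketched with a toy zero-dimensional energy balance model. This is not how attribution studies are actually done (they use full climate models); the feedback parameter, heat capacity, and CO2 path below are illustrative assumptions, and the forcing uses the common simplified logarithmic expression 5.35 ln(C/C0) W/m².

```python
import math

def run_energy_balance(co2_ppm, co2_ref=280.0, feedback=1.2, heat_capacity=8.0):
    """Integrate C dT/dt = F - feedback * T with one-year Euler steps.

    co2_ppm: sequence of annual CO2 concentrations (ppm).
    feedback (W/m^2/K) and heat_capacity (W yr/m^2/K) are illustrative values.
    Returns the list of annual temperature anomalies (K).
    """
    temp = 0.0
    temps = []
    for conc in co2_ppm:
        forcing = 5.35 * math.log(conc / co2_ref)   # simplified CO2 forcing, W/m^2
        temp += (forcing - feedback * temp) / heat_capacity
        temps.append(temp)
    return temps

n_years = 150
# Hypothetical path: CO2 rises smoothly from 280 to 400 ppm
with_co2 = [280.0 * (400.0 / 280.0) ** (i / (n_years - 1)) for i in range(n_years)]
# Counterfactual path: CO2 held at its reference level
without_co2 = [280.0] * n_years

warming_with = run_energy_balance(with_co2)[-1]
warming_without = run_energy_balance(without_co2)[-1]
print(f"with human CO2: {warming_with:.2f} K, without: {warming_without:.2f} K")
```

The only difference between the two runs is the CO2 input, so the difference in the simulated warming is attributable to that input. That is the logic of a “what-if” experiment, here in its simplest possible form.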


157 Contrarian fallacies attacking the hockey stick: The iconic hockey stick (App. I) is 
a fact, obtained by many independent research groups, and is based on data. Contrarian 
“critiques” are invalid for three reasons: 

(1) Incorrect claims regarding a “Medieval Warm” period (not worldwide, not 
temporally consistent, and not globally as warm as now), plus a misinterpreted old IPCC 
graph. 

(2) Incorrect criticism of sub-period-centered principal components (Ch. 48), and
the mathematical error of assuming that the same number of principal components
can be used independent of centering.

(3) Incorrect analyses of data. See RealClimate (ref) and the WGI IPCC science 
report (supra) for details. 


• Contrarian fallacies regarding stolen emails (“Climategate”) ¹⁵⁹


What do Contrarians Misunderstand about Climate Risk Management?


Climate Risk Management deals with mitigation to lower the risks of future
climate change and adaptation to adjust when necessary. For risk management,
risk estimation must be carried out. This requires the best input, including data,
models, scenarios, and statistical analysis. The IPCC expert reports (supra)
collect and summarize the best available information for sensible climate risk
management. Contrarians have systematically attacked the IPCC¹⁶⁰.

Some contrarians misunderstand (either by omission or deliberately) climate
risk management. They give high estimates for the price of climate risk
management, and low estimates for climate impacts and for the price of ignoring
climate risk management. They promote increasing fossil fuel usage and
oppose renewable energy development, ignoring climate impact risk. Contrarians


158 Poverty Elimination and Climate Mitigation — False Dilemma: A false dilemma or
“conflict” between worldwide poverty elimination and climate mitigation is raised by
some contrarians. Efforts to eliminate worldwide poverty through programs like the
Millennium Development Goals (MDG, above) are threatened by climate impacts. Poverty
elimination and climate mitigation are necessary and complementary, not contradictory.

Discount rate problem with benefit/cost calculations: Different calculations may
use different discount rates, invalidating some cost/benefit ratio comparisons. Note that a
difference of only 1% in discount rate produces a factor of more than 2 in present value
over 85 years (supra).
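
The size of the discount-rate effect can be checked with a short calculation. The 3% and 4% rates and the future amount of 100 are hypothetical values chosen for illustration; only the 1% difference and the 85-year horizon come from the text.

```python
def present_value(amount, rate, years):
    """Present value of a single future amount under annual compounding."""
    return amount / (1.0 + rate) ** years

years = 85
future_damage = 100.0                                # hypothetical future cost
pv_low = present_value(future_damage, 0.03, years)   # assumed 3% discount rate
pv_high = present_value(future_damage, 0.04, years)  # assumed 4% discount rate

ratio = pv_low / pv_high
print(f"PV at 3%: {pv_low:.2f}, PV at 4%: {pv_high:.2f}, ratio: {ratio:.2f}")
```

Two benefit/cost studies that differ only in this one input can therefore disagree by more than a factor of 2 on the present value of the same future damage.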

Poverty Elimination? One fact about poverty worth noting: fewer than 100 people
have the same wealth as the bottom half of humanity, three and a half billion
people. The worst poverty could be easily eliminated if the will existed. Contrarians
could help by supporting the MDG and the UN Green Climate Fund (supra).


159 “Climategate”, Investigations, and wasted time: A barrage of unsubstantiated
attacks on climate science, including false accusations of data manipulation, employed 
deliberately misinterpreted snippets from a cherry-picked few of the (thousands of) stolen 
emails, taken out of context with incorrect word interpretations (“Climategate”). For 
example the word "trick" (that commonly means a clever procedure in science) was 
blatantly misinterpreted as indicating deception. Multiple thorough investigations showed 
that these accusations were baseless (refs). 

The emails were stolen immediately before the 2009 Copenhagen Climate 
Conference in a clear attempt to sabotage the conference. I was there and witnessed 
journalists wasting the time of scientists in an IPCC session by focusing on the emails. 
Contrarian media continued these unsubstantiated attacks in an attempt to undermine trust 
in mainstream climate science. Much time was subsequently wasted responding to these 
attacks. 


Prof. K. Emanuel has an excellent summary of Climategate (ref). Read it. 


160 Contrarian Attacks on the IPCC: The three 2007 AR4 IPCC climate reports totaled
3,000 pages (and the 2013-14 AR5 reports are longer). A few isolated and minor mistakes
existed, notably one paragraph on glaciers in the 2007 Impacts WG2 report (not the
Science WG1 report). Contrarian media exploited this glitch to manufacture a destructive
general attack on IPCC reports, the IPCC itself, and on climate science in general.

These few minor errors should be contrasted with climate contrarian products, which 
are full of glaring errors that are ignored by contrarians and contrarian media. 
demand unreasonably high statistical certainty to delay action on climate,
denying the “precautionary principle”¹⁶¹, ¹⁶². They propagate the message
“there is minimal or no climate risk”. Notice that the statement “no risk” has no
uncertainty, and though contrarians (over) emphasize uncertainty in climate
science, they oppose risk management including uncertainty. This is both
inconsistent and wrong. Risk management is all about coping with uncertainty.
The contrarian denial of climate tail risk management is quite like the
irresponsible attitude of some traders before the 2008 market crash¹⁶³, ¹⁶⁴.


What about Limiting Government Action on Climate? 


The contrarian agenda of preventing government climate action and following 
Business As Usual will eventually backfire: 


161 Precautionary Principle; Contrarian Confidence Level Fallacy: Principle #15 of
the Rio Declaration (ref) notes: “In order to protect the environment, the precautionary
approach shall be widely applied by States according to their capabilities. Where there
are threats of serious or irreversible damage, lack of full scientific certainty shall not be
used as a reason for postponing cost-effective measures to prevent environmental
degradation.” Uncertainty is no excuse to ignore climate issues and avoid responsibility.

Attempts to measure “certainty” use a confidence level CL. CL = 95%
means that a given event is outside 95% of the distribution of such events (random or
non-random, measured or modeled). Some contrarians insist on a high CL in order to
avoid action. This is the contrarian “we have to be certain before we act” fallacy with a
mathematics veneer. The fallacy ignores the tail risks where the real danger lies.
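
Why the tail matters can be sketched with a deliberately simple two-outcome loss distribution. The losses and probabilities are hypothetical, and the helper name quantile_loss is an assumption made for this illustration.

```python
def quantile_loss(outcomes, level):
    """Smallest loss whose cumulative probability reaches `level`."""
    cumulative = 0.0
    for loss, prob in sorted(outcomes):
        cumulative += prob
        if cumulative >= level:
            return loss
    return max(loss for loss, _ in outcomes)

# Hypothetical outcomes as (loss, probability): a mild loss is likely,
# a severe loss is unlikely but far larger
outcomes = [(1.0, 0.97), (100.0, 0.03)]

loss_95 = quantile_loss(outcomes, 0.95)
expected = sum(loss * prob for loss, prob in outcomes)
tail_share = sum(loss * prob for loss, prob in outcomes if loss > loss_95) / expected

print(f"loss at CL = 95%: {loss_95}")
print(f"expected loss: {expected:.2f}, share from beyond the 95% level: {tail_share:.0%}")
```

In this toy case the loss at the 95% level looks harmless, yet roughly three quarters of the expected loss comes from the 3% tail that a “wait until we are 95% certain” rule never considers.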


162 Cheney’s “One-Percent Doctrine” and Action on Climate: The contrarian fallacy
of demanding certainty as a prerequisite for action is in direct contrast to the spirit of
former V.P. Cheney’s “One-Percent Doctrine”: “If there's a 1% chance that Pakistani 
scientists are helping al-Qaeda build or develop a nuclear weapon, we have to treat it as 
a certainty in terms of our response. It's not about our analysis ... It's about our 
response.” Cheney is talking about risk tails and acting on important but unlikely threats. 
Suppose we replace “nuclear weapon” with “climate change” in the statement. 
Mainstream analyses point to a likely substantial (perhaps existential) climate change 
threat. Even if the climate threat were substantial but unlikely, the idea behind Cheney’s 
doctrine should be a forceful argument for climate action. For a direct analogy, the 
Defense Department states that climate change can increase terrorism (QDR2014, ref). 


163 Story and Analogy: Traders, 2008, Tail Risk Management, Climate Contrarians:
An influential head derivatives trader said to me, before 2008, as an “argument” against
using my Stressed VAR that measures bad risk: “Look Jan, there hasn't been a recession
since the beginning of modern finance”. He wanted to argue that the institution didn't
need more capital. He didn't want to consider tail risk management. His “argument” 
proved dead wrong with the 2008 recession. 

Denial of bad potential financial risk is directly analogous to contrarian denial of bad 
climate risk. Both are dangerous denials of potential tail risk events, discussed in the text. 


164 Neanderthals: It is said that the Neanderthals became extinct because they could not 
adapt to new conditions (ref). Draw your own conclusions. 
• Effective action against climate risk will need to involve all sectors, from
individuals, businesses and other NGOs, to all levels of government.

• Ignoring climate risk will lead to increasingly severe climate impacts.

• Increasing climate impacts will eventually require more government
mitigation and regulation, plus increasingly expensive adaptation.


What about a “Climate Solution” with “Pure Free-Market Energy”?


Here are three points regarding contrarian obstruction to energy transition:
• The fossil-fuel industry is not exactly a free market entity. Large direct
and indirect subsidies have been given for many years, as noted above.
• Contrarians want to hamper the renewable energy market, preventing its
free market development and directly impeding climate mitigation.
• Some emerging technologies have had government assistance (e.g.
railroads in the 19th century, DARPA and the Internet — see refs. supra).


An Incomplete Tale: R. Earth and the Contrarian 


Climate change has increasingly made its way into the arts and popular culture¹⁶⁵.
Here is my attempt¹⁶⁶.

R. Earth appears to be sick. 100 reviewers give their diagnoses. 97 of the 100 
are specialists in the field, all holding permanent positions at universities and 
laboratories. These specialists give a consensus analysis, based on high 
temperature readings and other evidence, that R. Earth is indeed sick. 


• The 97 specialists conclude R. Earth's sickness could get much worse.
• The 97 specialists recommend aggressive treatment starting now.


The remaining 3 reviewers, replacing 3 dissenting specialists, loudly disagree. 


• A retired engineer says he can prove the sickness will not get worse.
• A political scientist blogs: “Treatment for the sickness is too expensive”.
• A climate contrarian, a Think Tank lawyer, denies R. Earth is sick at all.


The 97 specialists write a summary for an international scientific journal. 
They conclude there is expert consensus: a real problem exists, will get worse if 
no action is taken, and sensible risk management action is appropriate. While 
there is discussion of the details, the consensus is that prompt treatment is 
feasible and much better than waiting without treatment. 


165 Even Comedy? See Jon Stewart’s skit on clips of Dr. J. Holdren, Presidential Science
and Technology Adviser, responding to partisan contrarian attacks at a hearing of the HR 
Committee on Science, Space and Technology (ref, supra). 


166 Disclaimer: No reference to any particular person is intended. 
The contrarian lawyer writes a paper issued by the Think Tank saying no
consensus exists and that the specialists who concluded R. Earth is sick only have 
an "unprovable theory". He also says we should “wait to see what happens", and 
"if I am wrong we will eventually find out and we can do something then". 
Contrarian blogs support the paper and attack the international journal for not 
publishing it. The Think Tank gives the contrarian an award. 

Pseudo News TV has an interview with the contrarian. The host is hostile to
the "alarmist unprovable theory". Another show presents a discussion with one of 
the specialists plus the contrarian "for balance". The contrarian lawyer is the 
better debater, interrupts the moderator, and gets more studio audience applause. 

A powerful national politician calls the diagnosis of R. Earth's sickness “a 
hoax spread by people in a conspiracy who are after government grants" and 
holds a committee hearing where he calls in the contrarian to testify as an expert 
science witness. He does not invite any of the specialists to testify. He also 
introduces a bill to defund government programs designed to help cure R. Earth. 

An investigation shows the politician receives campaign contributions from 
the same sources funding the Think Tank. The report appears as a short news 
column in the back pages of a major newspaper. 

The 97 experts decide to join an “Earth Risk Management Group" that holds 
an international conference. The Think Tank holds an opposition conference. 

R. Earth groans. 


Maybe you can help finish the story... 


932 


Quantitative Finance and Risk Management 


References 


i “There is no Planet B” 


Ban Ki-Moon, UN Secretary General, People's Climate March (9/21/2014): 
http://www.bbc.com/news/science-environment-29301969 


ii Climate Change Risk to Business and the Economy


Rubin, R: "How ignoring climate change could sink the U.S. economy", 
Washington Post (7/24/2014): http://www.washingtonpost.com/opinions/robert- 
rubin-how-ignoring-climate-change-could-sink-the-us- 

economy/2014/07/24/b Tb4c00c-0df6-116e4-8341-b8072b16e7348 story.html 


Bloomberg, M.: “American Business Must Act to Reduce Climate Risk”; Wall
Street Journal (6/17/2014): http://riskybusiness.org/blog/american-business- 
must-act-to-reduce-climate-risk 


Paulson, H. M. Jr., “The Coming Climate Crash - Lessons for Climate Change
in the 2008 Recession”, NY Times, (6/21/2014):
http://www.nytimes.com/2014/06/22/opinion/sunday/lessons-for-climate- 
change-in-the-2008-recession.html? r-0 


Litterman, R., “The Insanity of not assessing Climate risk”, (interview, 2014):
http://reneweconomy.com.au/2014/the-insanity-of-not-assessing-climate-risk- 
83657 


Lagarde, C.: “IMF chief Lagarde warns of ‘merciless’ climate change”:
http://www.rtcc.org/2014/02/05/imf-chief-lagarde-warns-of-merciless-climate- 
change/ 


“Climate Change Is A Global Mega-Trend For Sovereign Risk", S&P: 
https://www.globalcreditportal.com/ratingsdirect/renderArticle.do?articleId=13 1 
8252&SctArtId-236925&from-CM &nsl code-LIME&sourceObjectId-860681 
3&sourceRevid=1 &fee_ind=N&exp_date=202405 14-20:34:43 


iii IPCC AR5 Climate Reports, Technical Summaries, Synthesis/SPM, AR4 Reports


The Physical Science Basis (WG1) Report and Technical Summary (2013): 
http://www.ipcc.ch/report/ar5/wg1/ ; 
http://www.climatechange2013.org/images/report/WGI1ARS TS FINAL.pdf 


Impacts, Vulnerability, Adaptation (WG2) Report and Technical Summary (2014):
http://www.ipcc.ch/report/ar5/wg2/ ; http://ipec- 
wg2.gov/AR5/images/uploads/WGIIARS-TS FGDall.pdf 


Mitigation (WG3) Report and Technical Summary (2014): 
http://www.ipcc.ch/report/ar5/wg3/; http://report.mitigation2014.org/drafts/final- 
draft-postplenary/ipec wg3 ar5 final-draft postplenary technical-summary.pdf 


AR5 Synthesis Report and SPM (2014): http:/Aipcc.ch/pdf/assessment-
report/arS/syr/SYR ARS LONGERREPORT.pdf 


IPCC Principles and Procedures; AR5 cutoff dates:
http://www .ipcc.ch/organization/organization procedures.shtml; 
http://www .ipcc.ch/pdf/ar5/ar5-cut-off-dates.pdf 


AR4 IPCC Reports (2007) and Nobel Prize:
http://www.ipcc.ch/publications_and_data/ar4/syr/en/contents.html


Chapter 53: Climate Change Risk Management 933 


Y Some General Resources on Climate Change 


Schneider, S., “An Overview of the Climate Change Problem”; “Mediarology:
The Roles of Citizens, Journalists, and Scientists in Debunking Climate
Change Myths”; “Climate Change: Do We Know Enough for Policy Action?”:
http://stephenschneider.stanford.edu/Climate/Overview.html ; 
http://stephenschneider.stanford.edu/Mediarology/Mediarology.html ; Science 
and Engineering Ethics, Vol. 12, No. 4, pp. 607-636 (2006) 


“World Bank Group President Jim Yong Kim Remarks at Davos Press 
Conference”, World Bank (1/23/2014): 
http://www.worldbank.org/en/news/speech/2014/01/23/world-bank-group- 
president-jim-yong-kim-remarks-at-davos-press-conference 


Data Sources catalog: http://www.realclimate.org/index.php/data-sources/ 


National Greenhouse Gas Emissions Data (US), EPA (2014): 
http://www.epa.gov/climatechange/ghgemissions/usinventoryreport.html; 
http://www.epa.gov/climatechange/Downloads/ghgemissions/US-GHG- 
Inventory-2014-Chapter-Table-of-Contents.pdf 


Climate website links list (Climate Portal): http://climate.uu- 
uno.org/topics/view/51cbfc74f702fc2ba8129144/ 


Gore, A. “An Inconvenient Truth", Rodale Press (2006): 
http://en.wikipedia.org/wiki/An_Inconvenient_Truth_%28book%29 


Tolman, C. “Climate Change News”: http://tolmanccnews.blogspot.com/ 


Roston, E. “The Carbon Age: How Life's Core Element Has Become 
Civilization's Greatest Threat”, Walker Publishing Co., Inc. NY (2009): 
https://www.academia.edu/2010469/The_carbon_age_how_lifes_core_element_has_become_civilizations_greatest_threat


Fordham U. Conferences, Center Research Contemporary Finance (2012, 2010): 
http://www.bnet.fordham.edu/Finance Research Center/video E.html (video); 
http://www.scribd.com/doc/83777526/Fordham-Climate-Economics-and- 
Energy-Finance-Symposium-2012 (agenda); 
http://www.bnet.fordham.edu/Finance_Research_Center/conferences |.html 


" My History Related to Climate and Scientific Publications 


Managing Editor, Climate Portal; Matchmaker, Climate Science Rapid 
Response Team (CSRRT); Scientific publications: http://climate.uu-uno.org/ ; 
http://climaterapidresponse.org/ ; 
http://scholar.google.com/scholar?hl=en&q=Jan+W+Dash&btnG=&as_sdt=1%2 
C31&as_sdtp ; http://www.researchgate.net/profile/Jan_Dash/publications 


Е Resiliency Plans (example) 


NY City: http://www.climatecentral.org/news/new-york-launches-20-billion- 
climate-resiliency-plan-16106 


“ Climate Risk Management Perspective 


Emanuel, K., "Tail Risk vs. Alarmism", Climate Change National Forum 
(2014): http://climatechangenationalforum.org/tail-risk-vs-alarmism/ 


“i Abrupt Climate Change 


934 Quantitative Finance and Risk Management 


• “Abrupt Climate Change: Inevitable Surprises”, Richard B. Alley et al,
Committee on Abrupt Climate Change, National Academy Press (2002): 
http://www.nap.edu/openbook.php?isbn=0309074347 


• Mayewski, P. et al, “West Antarctica’s Sensitivity to Natural and Human-forced
Climate Change Over the Holocene”, J. Quat. Science 28(1) 40-4 (2013):
http://www.geology.um.maine.edu/publications/Mayewski%20et%20al.%202013.pdf


* Animation Video for Increasing Historical Summer Temperatures 1951-2011 
• “Hemisphere Summer Temperature Anomalies, 1951-2011”, NASA Scientific
Visualization Studio: http://svs.gsfc.nasa.gov/cgi-bin/details.cgi?aid=3975 


* Figure for Increasing Temperature Risk 
• IPCC WGI 2001, Fig. 2.32: http://www.ipcc.ch/ipccreports/tar/wg1/fig2-32.htm


? Attitudes toward climate risk management 
• “Global Warming Six Americas”, Yale U. Climate Change Communication
(2012): http://environment.yale.edu/climate-communication/article/Six- 
Americas-September-2012 ; http://environment.yale.edu/climate- 
communication/filtered/?action=add_filter&fl=fl 


“ Climate Insurance 
• “The Cost of Delaying Action to Stem Climate Change”, Council of Economic
Advisors, Office of the President, (2014):
http://www.eenews.net/assets/2014/07/29/document_cw_01.pdf


• Principles for Sustainable Insurance, UNEP FI (2012),
http://www.unepfi.org/psi/wp-content/uploads/2012/06/PSI-document.pdf 


х Catastrophic Risk Insurance Company Views on Climate Change (Munich Re) 
• Munich Re: http://www.munichre.com/en/group/focus/climate-change


*" Climate Mitigation and “Insurance” 

• Pindyck, R. “Pricing Carbon When We Don't Know the Right Price”, Energy &
Environment, Regulation, Cato Institute (2013): 
http://object.cato.org/sites/cato.org/files/serials/files/cato- 
video/2013/6/regulation-v36n2-1-2.pdf 


* Climate Change Action and Ethics, Morality, Justice, Human Rights 
• “Climate Change: Summary and Recommendations to Governments” (quoted in
text), NGO Committee on Sustainable Development (CoNGO, NY, 2014): 
https://www.scribd.com/doc/245315671/CSD-CoNGO-Climate-Change-Paper- 
Lima-2014-FINAL; CoNGO website: http://www.ngocongo.org/ 


• “Achieving Justice and Human Rights in an Era of Climate Disruption”,
International Bar Association report (2014):
http://www.ibanet.org/Document/Default.aspx?DocumentUid=0f8cee12-ee56-4452-bf43-cfcab196cc04


• Women and Climate Change, Secretary's International Fund: U.S. Dept. State
(2010): http://www.state.gov/documents/organization/141062.pdf ; 
http://www.state.gov/s/gwi/rls/other/2010/140805.htm 




Hansen, J. “Storms of my Grandchildren”, Bloomsbury USA (2009): 
http://www.bloomsbury.com/us/storms-of-my-grandchildren-9781608195022/


Talbott, S. and Antholis, W., “Fast Forward: Ethics and Politics in the Age of 
Global Warming”, Revised Edition (2011), Chapter 1: 
http://www.brookings.edu/research/books/2011/fastforwardrevised


Dash, J. W., “Ethical Aspects of a Climate Treaty”, mp3 audio (UUSF, 2010): 
http://climate.uu-uno.org/files/177501 177600/177547/20101024jdsermon.mp3 


“Climate Ethics and Economics" (project): http://blogs.helsinki.fi/climate- 
ethics/2014/06/06/about/ 


The Rock Ethics Institute: http://sites.psu.edu/rockblogs/tag/climateethics/ 


Markowitz, E. and Shariff, A., “The moral case of climate change”, Climate
Science and Policy (2012): http://www.climatescienceandpolicy.eu/2012/09/the- 
moral-case-of-climate-change/ 


Utility Theory, Utilitarianism: http://en.wikipedia.org/wiki/Utility_theory ;
http://en.wikipedia.org/wiki/Utilitarianism


“i MDGs, Post-2015 Development Agenda, Sustainability, Environmental Justice 


Millennium Development Goals (MDGs): 
http://en.wikipedia.org/wiki/Millennium_Development_Goals; 


“UN Post-2015 Development Agenda”, CONGO (2014): 
http://ngocongo.org/newsletters_2014/Communication%20June%202014.pdf 


“Review of implementation of Agenda 21 and the Rio Principles”, Sustainable
Development in the 21st century (SD21), Synthesis (2012):
https://sustainabledevelopment.un.org/content/documents/641Synthesis_report_ 
Web.pdf 


Sustainability: http://en.wikipedia.org/wiki/Sustainability#Definition 
SASB: Sustainability Accounting Standards Board: http://www.sasb.org/ 
Environmental Justice - EPA: http://www.epa.gov/environmentaljustice/ 


Environmental Justice - Greenfaith: 
http://www.greenfaith.org/programs/environmental-justice 


“Our Place in the Web of Life", curriculum, UU Ministry for Earth: 
http://tipsheet.blogs.uua.org/resources/environmental-curriculum-for- 
congregations-ready/ 


Coal - Environmental Impacts (Wikipedia): 
http://en.wikipedia.org/wiki/Environmental_impact_of_the_coal_industry


*! Population Distribution and Energy Use 


Statistics: http://en.wikipedia.org/wiki/World_population
Energy Consumption: http://en.wikipedia.org/wiki/World_energy_consumption


хий Copenhagen Accord 


Copenhagen Climate conference, UNFCCC COP 15 (2009):
http://unfccc.int/files/meetings/cop_15/application/pdf/cop15_cph_auv.pdf


** Stranded Assets, the Carbon Bubble, and Risks to the Economy 




Blood, D., Gore, A.: “The Coming Carbon Asset Bubble”, Wall Street Journal,
(10/29/2013): http://www.wsj.com/articles/SB10001424052702304655104579163663464339836


“Wasted capital and Stranded Assets” (2013); “Carbon Bubble — Unburnable 
Carbon” (2012), Carbon Tracker reports: 
http://www.carbontracker.org/report/wasted-capital-and-stranded-assets/; 
http://www.carbontracker.org/report/carbon-bubble/ 


“Carbon bubble will plunge the world into another financial crisis”, Guardian 
(4/19/2013): http://www.guardian.co.uk/environment/2013/apr/19/carbon- 
bubble-financial-crash-crisis 


** Carbon Risk Valuation Tool 


Bloomberg LP (2013): http://www.bloomberg.com/now/2013-12- 
05/introducing-our-carbon-risk-valuation-tool/ 


™ Carbon tax — relationship to gasoline price increase 


Citizens Climate Lobby (cf. FAQ - energy prices): 
http://citizensclimatelobby.org/carbon-fee-and-dividend/ 


xi Business Legal and Reputation Risks 


“Climate Change Risk Perception and Management: A Survey of Risk 
Managers”, CERES (2010): http://www.ceres.org/resources/reports/risk- 
manager-survey/view 


“Risk and reputation in the age of disruption”, Allianz Risk Barometer Top
Business Risks 2015: http://www.agcs.allianz.com/assets/PDFs/Reports/Allianz-Risk-Barometer-2015_EN.pdf


Center for Climate Change Law, Columbia U.: 
http://web.law.columbia.edu/climate-change 


Gerrard, M. et al, “Climate Change Litigation in the US", Arnold & Porter LLP: 
http://www.arnoldporter.com/resources/documents/ClimateChangeLitigationCha 
rt.pdf 


Our Children's Trust: http://ourchildrenstrust.org/atl 


“Fiduciary duty & climate change disclosure”, Climate Disclosure Standards
Board CDSB: http://www.cdsb.net/fiduciarystatement/statement 


Public Trust Doctrine: http://www.law.cornell.edu/wex/public_trust_doctrine


[International Trade Agreements and Environmental Laws]: Knox, J. H., “The 
Judicial Resolution of Conflicts between Trade and the Environment", Harvard 
Environmental Law Review, Vol 28 (2004): 
http://www.law.harvard.edu/students/orgs/elr/vol28_1/knox.pdf 


Lim, A., “How First Nations in Canada Are Winning the Fight Against Big Oil”,
The Nation (9/29/14): http://www.thenation.com/article/181571/how-first- 
nations-canada-are-winning-fight-against-big-oil# 


х Investor Legal Action 


Covington, H. and Thamotheram, R., “The Case for Forceful Stewardship (Part 
2): Managing Climate Risk", Cambridge U. working paper (2015): 
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2551485 


ху The Oil Industry — former and current views on Climate Change Risk (sample) 




“Exxon Secrets” (Greenpeace): 
http://www.greenpeace.org/usa/en/campaigns/global-warming-and-energy/exxon-secrets/


“Energy and Carbon — Managing the Risks”, Exxon Mobil: 
http://cdn.exxonmobil.com/~/media/Files/Other/2014/Report%20-%20Energy%20and%20Carbon%20-%20Managing%20the%20Risks.pdf


“ExxonMobil’s views and principles on policies to manage long-term risks from 
climate change": http://corporate.exxonmobil.com/en/current-issues/climate- 
policy/climate-policy-principles/overview 


Exxon Mobil Carbon Disclosure Project Report (2013): 
http://cdn.exxonmobil.com/~/media/Reports/Other%20Reports/2013/cdp_investor_2013.pdf


Shell Oil: http://www.shell.com/global/environment- 
society/environment/climate-change.html 


“Engaging with oil companies on climate change is futile, admits leading UK 
environmentalist”, The Guardian (1/15/2015):
http://www.theguardian.com/environment/2015/jan/15/engaging-with-oil- 
companies-climate-change-futile-admits-leading-environmentalist 


ХУ Social Cost of Carbon 


“Social Cost of Carbon...", EPA, Technical Support Document (2010): 
http://www.epa.gov/oms/climate/regulations/scc-tsd.pdf 


“Social cost of carbon six times higher than thought — study”, RTCC 
(1/12/2015): http://www.rtcc.org/2015/01/12/social-cost-of-carbon-six-times- 
higher-than-thought-study/ 


Daniel, D., Litterman, R., and Wagner, G., “Applying Asset Pricing Theory to 
Calibrate the Price of Climate Risk” (2015): 
http://www.kentdaniel.net/papers/unpublished/DLW_climate-20150201.pdf


Weitzman, M., “Fat Tails and the Social Cost of Carbon”, American Economic 
Review: Papers & Proceedings 104(5): 544—546 (2014): 
http://scholar.harvard.edu/files/weitzman/files/aer.104.5.544fattailsandthesocialcostofcarbon.pdf


Tol, R., “The Social Cost of Carbon: Trends, Outliers and Catastrophes”,
Economics (2008): http://www.economics-ejournal.org/economics/Journalarticles/2008-25/version_1/at_download/file


“i Value at Risk and Climate Change 


“Value at Risk: Climate Change and the Future of Governance", CERES 
Sustainable Governance Project (2002): 
http://www.ceres.org/resources/reports/value-at-risk-climate-change-and-the- 
future-2002/view 


Caldecott, B., “Stranded assets, environment-related risks, and agriculture”,
Smith School of Enterprise and the Environment, U. Oxford (2013): 
http://www.smithschool.ox.ac.uk/research/stranded-assets/Stranded%20Assets%20Agriculture%20Report%20Final.pdf


“Natural Capital at Risk — The Top 100 Externalities of Business", Trucost, 
TEEB for Business Coalition (2013): http://www.trucost.com/published-research/99/natural-capital-at-risk-the-top-100-externalities-of-business




• Covington, H. and Thamotheram, R., “The Case for Forceful Stewardship (Part
1): The Financial Risk from Global Warming”, Cambridge U. working paper 
(2015): http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2551478 


xxvii USCAP 
• USCAP: http://www.us-cap.org/about-us/our-report-a-call-for-action/ ;
http://www.us-cap.org/USCAPCallForAction.pdf 


xxviii CERES 
• CERES website: http://www.ceres.org/about-us/coalition


*** “Climate Smart Development” 

• ©2014 International Bank for Reconstruction and Development/The World
Bank and Climate Works Foundation: 
http://documents.worldbank.org/curated/en/2014/06/19703432/climate-smart- 
development-adding-up-benefits-actions-help-build-prosperity-end-poverty- 
combat-climate-change-vol-1-2-main-report 


** Deutsche Bank and Climate Change Opportunities 
• “Deutsche Bank's Asset Management division publishes major climate change
research - Indicates investing in climate change can help stimulate economies”
(2008): https://www.db.com/medien/en/content/press_releases_2008_4143.htm


хі Bloomberg L.P. Sustainability 
• “Good for Business, Good for the Planet” (2013):
http://www.bloomberg.com/now/2013-07-24/bloomberg-sustainability-good-
for-business-good-for-the-planet/ 


i The Free-Rider Problem 
• Wikipedia: http://en.wikipedia.org/wiki/Free_rider_problem


"5 Long Term Rates
• U.S. Treasury (accessed 4/11/2014): http://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=realyield


“xiv Economics, Economic Models and Discount Rates 
• Nordhaus, W., “The Climate Casino: Risk, Uncertainty, and Economics for a
Warming World”, Yale University Press (2013).


• The Taylor Rule: http://en.wikipedia.org/wiki/Taylor_rule


• Economics of Global Warming:
http://en.m.wikipedia.org/wiki/Economics_of_global_warming


• Weighted Average Cost of Capital:
http://en.m.wikipedia.org/wiki/Weighted_average_cost_of_capital 


• Ramsey Formula: http://en.m.wikipedia.org/wiki/Stern_Review#Discounting


“Y Critique of Economic Models of Climate change 
• DeCanio, S., “Economic Models of Climate Change, A Critique”, Palgrave
MacMillan (2003): www.stephendecanio.com/ 




Xavi Collateral Credit Support Annex 
• Wikipedia: https://en.wikipedia.org/wiki/Credit_Support_Annex


* i Negative Discount Rate Model 
• Weitzman, M., “The Ramsey Discounting Formula for a Hidden-State
Stochastic Growth Process”, Environ Resource Econ 53, pp. 308-321 (2012)


"ii The “Ultimate Forward Rate”, Solvency II, QIS 5 
• “The Ultimate Forward Rate: Implications for LDI Strategies”, PIMCO (2012):
http://europe.pimco.com/EN/Insights/Pages/The-Ultimate-Forward-Rate- 
Implications-for-LDI-Strategies.aspx 


• The Solvency II Directive: http://en.wikipedia.org/wiki/Solvency_II_Directive


• QIS5 Technical Specifications:
http://ec.europa.eu/internal_market/insurance/docs/solvency/qis5/201007/technical_specifications_en.pdf


“<i Ethics for future generations and the discount rate for climate impacts 
• Varian, H., “Recalculating the Costs of Global Climate Change”, NYT (2006):
http://people.ischool.berkeley.edu/~hal/people/hal/NYTimes/2006-12-14.html


х Negative Interest Rates 
• “Negative Policy Interest Rates: Should the SNB Consider Them?”, IMF
Country Report No. 13/129 (2013):
http://www.imf.org/external/pubs/ft/scr/2013/cr13129.pdf


“i Finance Risk Case Study 
• Wilmarth, A. Jr., “Citigroup: A Case Study in Managerial and Regulatory
Failures”, 47 IND. L. REV. pp. 69-137 (2013): 
http://scholarship.law.gwu.edu/cgi/viewcontent.cgi?article=2234&context=facul 
ty_publications 


xli Climate Instabilities and Economic Instabilities 
• Hall, D. and Behl, R., “Integrating economic analysis and the science of climate
instability”, Ecological Economics 57, 442-465 (2006):
http://www.sciencedirect.com/science/article/pii/S0921800905002417


35 Risky Business 
• Risky Business: http://riskybusiness.org/ ; http://riskybusiness.org/blog
• Bloomberg, M., Paulson, H., Steyer, T., “The Economic Risks of Climate
Change in the United States” (10/3/2013): http://riskybusiness.org/blog/we-need-climate-change-risk-assessment


• Risky Business - Full Report (2014):
http://riskybusiness.org/uploads/files/RiskyBusiness_PrintedReport_FINAL_WEB_OPTIMIZED.pdf


• Hauser, T. et al, “American Climate Prospectus: Economic Risks in the United
States", Rhodium Group (2014): http://rhg.com/reports/climate-prospectus 


• Kopp, R., Hsiang, S. et al, Rhodium Group Technical Report - Input to the Risky
Business Project (2014): http://rhg.com/wp-content/uploads/2014/08/RHG_AmericanClimateProspectus_v1.1.pdf


• Bloomberg Philanthropies: http://www.bloomberg.org/ ; The Next Generation:
http://thenextgeneration.org 


id Supply Line and other Business Risks; Corporate Action on Climate Change 


• Supply Chain Climate Risk Management, Four Twenty Seven:
http://427mt.com/services/climate-change-risk-management/supply-chain- 
climate-risk-management/ 


• “Severe Weather and Manufacturing in America: Comparing the cost of
Droughts, Storms and Extreme Temperatures with the cost of new EPA
Standards”, Business Forward Foundation report (2014):
http://www.businessfwd.org/blog/report-severe-weather-and-manufacturing-in-america


• “Industry Awakens to Threat of Climate Change”, NY Times (1/23/2014):
http://www.nytimes.com/2014/01/24/science/earth/threat-to-bottom-line-spurs-action-on-climate.html?_r=0


• “Big Business Working on Climate Change”, U.S. News and World Report
(8/5/2014): http://www.usnews.com/news/blogs/at-the-edge/2014/08/05/wal-mart-ibm-and-coke-among-companies-addressing-climate-change


• Reynolds, B., “Climate Money Policy”:
http://www.climatemoneypolicy.squarespace.com/latest/ 


xV Sustainability Officer 


• Wikipedia: http://en.wikipedia.org/wiki/Chief_sustainability_officer


** Environmental, Social, Governance (ESG) Ratings and Indices 


• Bloomberg LP: http://www.bsr.org/en/our-insights/bsr-insight-article/bloomberg-launches-esg-data-service


xlvii Climate Bonds, Green Bonds, Securitized ABS 


• Climate Bonds: http://www.climatebonds.net/#sthash.djXU6k6I.dpuf


• Green Bond Principles: http://www.jpmorganchase.com/corporate-responsibility/green-bonds


• “SolarCity Securitization Deal Jolts Solar Energy Industry”, TheStreet
(11/22/2013): http://www.thestreet.com/story/12121215/1/solarcity-securitization-deal-jolts-solar-energy-industry.html


xvii Sustainable Energy Financing 


• International Institute for Sustainable Development (IISD) report (2014):
http://climate-l.iisd.org/news/July-2014-sustainable-energy-finance-update/


• “Climate Finance”, World Bank:
http://www.worldbank.org/en/topic/climatefinance 


Хх Businesses and Renewable Energies 


• “Sustainable Energy in America, 2014”, Bloomberg Finance LP, Business
Council for Sustainable Energy: 
http://www.bcse.org/sustainableenergyfactbook.html 


• “Power Forward 2.0: How American Companies are Setting Clean Energy
Targets and Capturing Greater Business Value”; CERES, WWF, Calvert
Investments, DGA report (2014):
http://www.ceres.org/resources/reports/power-forward-2.0-how-american-companies-are-setting-clean-energy-targets-and-capturing-greater-business-value


• Robertson, J.; “Building a Green Economy” (2010):
http://www.geoversiv.com/building-a-green-economy/


• “Businesses Invest in Climate Action - CEOs plan for a climate change future”,
Climate Nexus, (2014): http://climatenexus.org/wp-content/uploads/2014/08/BusinessesAndClimateChange.pdf


• IRENA (International Renewable Energy Agency): costs; jobs and renewables:
http://costing.irena.org/irena-renewable-costing-alliance.aspx ;
http://www.irena.org/DocumentDownloads/Publications/IRENA_RE_Jobs_Annual_Review_2015.pdf


• Desertec (gigantic solar project in North Africa): http://www.dii-eumena.com/fileadmin/flippingbooks/dp2050_exec_sum_engl_web.pdf


• “Report: Rooftop solar already cheaper than utility rates in most major cities”,
Charlotte Business Journal (1/8/2015):
http://m.bizjournals.com/charlotte/blog/energy/2015/01/report-rooftop-solar-already-cheaper-than-utility.html?r=full


Companies Planning including a Carbon Price 
• “Large Companies are Prepared to Pay Price on Carbon”, NY Times,
(12/05/2013): http://www.nytimes.com/2013/12/05/business/energy- 
environment/large-companies-prepared-to-pay-price-on-carbon.html 


• “Use of internal carbon price by companies as incentive and strategic planning
tool - a review of findings from CDP 2013 disclosure”, CDP (2013):
https://www.cdp.net/CDPResults/companies-carbon-pricing-2013.pdf 


• CDP Global 500 Climate Change Report 2013, “Sector insights: what is driving
climate change action in the world's largest companies?”:
https://www.cdp.net/cdpresults/cdp-global-500-climate-change-report-2013.pdf


Н Business Environmental Leadership Council (C2ES) 
• BELC: http://www.c2es.org/business/belc


t World Economic Forum Report 
• Report: http://www.weforum.org/reports/global-risks-2013-eighth-edition ; p. 20


tii United Nations Environment Programme - Finance Initiative (UNEP FI) 
• “Finance and Conflict”, UNEP FI: http://www.unepfi.org/work-streams/finance-and-conflict/ ;
http://www.unepfi.org/publications/finance-and-conflict/


НУ Natural Capital Declaration (NCD) 
• NCD document (2013): http://www.naturalcapitaldeclaration.org/wp-content/uploads/2013/10/NCD-booklet-English.pdf


Rio + 20 Conference (2012), Rio Conference (Earth Summit 1992) 
• Wikipedia: http://en.wikipedia.org/wiki/United_Nations_Conference_on_Sustainable_Development ;
http://en.wikipedia.org/wiki/Earth_Summit


Ы Banking Environment Initiative, Cambridge Inst. Sustainability Leadership 
• BEI website: http://www.cisl.cam.ac.uk/Business-Platforms/Banking-Environment-Initiative.aspx


• CISL website: http://www.cisl.cam.ac.uk/


Mi World Business Council for Sustainable Development 
• WBCSD website: http://www.wbcsd.org/about.aspx


Mii Shareholder Resolutions 
• CSR Reporting (6/25/2014): http://www.csrreporting.com/record-number-of-sustainability-related-shareholder-resolutions-filed-in-2014-2/


ix Socially Responsible Investing 
• Wikipedia: http://en.wikipedia.org/wiki/Socially_responsible_investing


* Carbon Pricing and Renewable Energy Investing 
• “Ernst & Young: Carbon price uncertainty undermining investment”, Climate
Spectator (2013): http://www.businessspectator.com.au/news/2013/8/20/carbon-markets/ernst-young-carbon-price-uncertainty-undermining-investment


• Climate Policy Initiative: “Climate Change, Investment and Carbon Markets
and Prices - Evidence from Manager Interviews” (2011):
http://climatepolicyinitiative.org/wp-content/uploads/2011/12/Climate-Change-Investment-and-Carbon-Markets-and-Prices.pdf


• Investor Words: http://www.investorwords.com/7024/break_even_time.html


• IRS and renewable energy tax credits (The Hill 8/11/2014, and IRS):
http://thehill.com/policy/energy-environment/214823-irs-guidance-relaxes-renewable-energy-tax-credit ; https://www.novoco.com/energy/retc/irsguide.php


xi Divestment Resolutions (examples) 
• United Church of Christ (2013): http://www.ucc.org/news/GS2013-fossil-fuel-divestment-vote.html


• Unitarian Universalist Association (2014):
http://www.uua.org/news/pressroom/pressreleases/296102.shtml 


• World Council of Churches (2014):
http://www.theguardian.com/environment/2014/jul/11/world-council-of-churches-pulls-fossil-fuel-investments


Б Basel III, Regulation and Climate Risk
• Basel III: http://en.wikipedia.org/wiki/Basel_III


• “Stability and Sustainability in Banking Reform: Are Environmental Risks
Missing in Basel III?", CISL & UNEP FI (2014): 
http://www.unepfi.org/fileadmin/documents/StabilitySustainability.pdf 


xii SEC disclosure guidelines 
• SEC - climate change disclosure: http://www.scribd.com/doc/142348325/SEC-Climate-Change-Disclosure-Overview-CRS


у Stern Review and related work
• Stern Review (2008): http://en.wikipedia.org/wiki/Stern_review ;
http://www.guardian.co.uk/environment/2008/jun/26/climatechange.scienceofclimatechange


• Dietz, S. and Stern, N.: “Endogenous growth, convexity of damages and climate
risk: how Nordhaus’ framework supports deep cuts in carbon emissions”. Centre 
Climate Change Economics and Policy, Grantham Research Institute (6/2014): 
http://www.lse.ac.uk/GranthamInstitute/wp-content/uploads/2014/06/Working- 
Paper-180-Dietz-and-Stern-2014.pdf 


у Climate Impacts, Economic Models, Mitigation Policy (examples) 
• Tol, R., “Correction and Update: The Economic Effects of Climate Change”,
Table 1, Fig. 2 (2014): http://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.28.2.221 


• Ward, B., “Errors in estimates of the aggregate economic impacts of climate
change - Part II”, LSE/Grantham Institute (2014):
http://www.lse.ac.uk/GranthamInstitute/news/errors-in-estimates-of-the-aggregate-economic-impacts-of-climate-change-part-ii/


• Moore, F. and Diaz, D., “Temperature impacts on economic growth warrant
stringent mitigation policy”, Nature Climate Change 2481 (1/12/2015):
http://www.nature.com/nclimate/journal/vaop/ncurrent/full/nclimate2481.html


• “Key Economic Sectors and Services”, AR5 WGII IPCC (2014): http://ipcc-wg2.gov/AR5/images/uploads/WGIIAR5-Chap10_OLSM.pdf ; “The limits of
the economic assessment of climate change risks”: AR5 Synthesis Report, Box
3.1, p. 35 (2014), supra.


• DeCanio, S., “Economic Models of Climate Change – A Critique”, Palgrave
Macmillan (2003): http://stephendecanio.com/ 


• Kaya Identity: https://en.wikipedia.org/wiki/Kaya_identity


"i REMI Economic Model on Carbon Fee and 100% Dividend 
• “The Economic, Climate, Fiscal, Power, and Demographic Impact of a National
Fee-and-Dividend Carbon Tax”, Regional Economic Models, Inc. (REMI):
http://citizensclimatelobby.org/wp-content/uploads/2014/06/REMI-carbon-tax-report-62141.pdf ;
http://citizensclimatelobby.org/wp-content/uploads/2014/06/REMI-National-SUMMARY.pdf


• Scott Nystrom, REMI Senior Economic Associate (private communication)


wi EPA Climate/Economic Models 
• EPA: http://www.epa.gov/climatechange/economics/index.html ;
http://www.epa.gov/climatechange/economics/modeling.html#minicam 


xvii IMF Economic/Climate Models 
• “Climate Change and the Global Economy”, Ch. 4, World Economic Outlook,
IMF (2008): http://www.imf.org/external/pubs/ft/weo/2008/01/pdf/c4.pdf; 
“Climate, Environment, and the IMF” IMF Factsheet (9/2014): 
http://www.imf.org/external/np/exr/facts/enviro.htm 


xx “We Still have a Long Way to Go”
• King Melchior, “Amahl and the Night Visitors”, G. C. Menotti (G. Schirmer, ed.
1997, p. 22); at 7:07 min on: https://www.youtube.com/watch?v=8YkgGAIa7nA


х APPENDIX I (CLIMATE SCIENCE):


“Yes, Virginia”, Newseum: http://www.newseum.org/exhibits/online/yes- 
virginia/ 


xxi Climate Science Update, Temperature data, CO2 emissions, Troposphere, SSA


“Climate Risk: An Update on the Science”, Met Office, Hadley Center (2014):
http://www.metoffice.gov.uk/media/pdf/o/p/Climate_risk_an_update_on_the_science.pdf


Cowtan, K., “Temperature Trends” (Data from different sources that can be 
plotted for different time periods and with different moving averages, with 
trends and uncertainties): 
http://www.ysbl.york.ac.uk/~cowtan/applets/trend/trend.html


Global Temperatures since 1880 - NASA (GISS) Graph and numbers:
http://data.giss.nasa.gov/gistemp/graphs_v3/ ;
http://data.giss.nasa.gov/gistemp/tabledata_v3/GLB.Ts+dSST.txt


Hansen, J. et al, “Global Temperature in 2014 and 2015”, Earth Institute
Columbia U. (1/16/2015): http://csas.ei.columbia.edu/2015/01/16/global-temperature-in-2014-and-2015/


Quality assurance and homogeneity adjustments (NOAA) for temperature data:
http://www.ncdc.noaa.gov/ghcnm/v3.php?section=quality_assurance ;
http://www.ncdc.noaa.gov/ghcnm/v3.php?section=homogeneity_adjustment


Global Carbon Project; Global Carbon Budget: 
http://www.globalcarbonproject.org/carbonbudget/14/hl-compact.htm 


CO2 emissions (2013): http://www.globalcarbonproject.org/carbonbudget/13/hl-full.htm#summary


“Tropical Tropospheric Trends”, RealClimate (2007): 
http://www.realclimate.org/index.php/archives/2007/12/tropical-troposphere- 
trends/ 


Troposphere Warming and Stratosphere Cooling, Espere: 
http://www.atmosphere.mpg.de/enid/20c.html 


Rahmstorf, S. and Foster, G., “Global temperature evolution 1979-2010”, 
Environ. Res. Lett. 6 (2011): http://iopscience.iop.org/1748-9326/6/4/044022


Singular Spectrum Analysis SSA and MSSA and climate data: Ghil, M. et al.,
“Advanced spectral methods for climatic time series”, Rev. Geophys. 40(1), 3.1-3.41
(2002): http://web.atmos.ucla.edu/tcd//PREPRINTS/2000RG.pdf 


Bloomberg Carbon Clock: http://www.bloomberg.com/graphics/carbon-clock/ 


Dash, J. W., Zhang, Y., Miglozzi, B., Roston, E., “A Forecast for Global CO2
Levels”, Bloomberg LP working paper (2015):
http://www.bloomberg.com/graphics/carbon-clock/BLOOMBERG-CARBON-CLOCK-TECHNICAL-WORKING-PAPER-12-01-15.pdf


Permafrost carbon cycle: http://en.wikipedia.org/wiki/Permafrost_carbon_cycle


ой The “Hockey Stick" 


Wahl, E. R. and Ammann, C. M., “Robustness of the Mann, Bradley, Hughes
reconstruction of Northern Hemisphere surface temperatures: Examination of
criticism ...”, Climatic Change Vol. 85, pp. 33-69, (2007):
http://www.rap.ucar.edu/projects/rc4a/millennium/refs/Wahl_ClimChange2007.pdf


Kaufman, D. et al, “Continental-scale temperature variability during the past
two millennia”, Nature Geoscience 6, 339-346 (2013):
http://www.nature.com/ngeo/journal/v6/n5/full/ngeo1797.html


Temperature Reconstructions figure (IPCC WG1 AR4, 2007):
http://www.ipcc.ch/graphics/ar4-wg1/jpg/fig-6-10.jpg


>i The Holocene, Present Temperature, and Human Civilization 


Hansen, J. et al: “Target Atmospheric CO2: Where Should Humanity Aim?” 
(Origin of 350 ppm CO2), The Open Atmospheric Science Journal 2, 217-231
(2008): http://www.columbia.edu/~jeh1/2008/TargetCO2_20080407.pdf 


Marcott, S. et al, “A Reconstruction of Regional and Global Temperature for the
Past 11,300 Years”, Science Vol. 339, pp. 1198-1201, (2013): 
http://m.sciencemag.org/content/339/6124/1198.abstract 


CO2 over last 10,000 years (IPCC WG1 AR4, 2007): 
http://www.ipcc.ch/graphics/ar4-wg1/jpg/spm1.jpg 


lxiv El Niño/La Niña, the “Hiatus”, Models, Energy Imbalance, the PDO, Chaos 


El Niño/La Niña (NOAA): http://www.elnino.noaa.gov/index.html 


Chen, X. and Tung, K., “Varying planetary heat sink led to global-warming 
slowdown and acceleration", Science Vol. 345 no. 6199 pp. 897-903 (2014): 
http://m.sciencemag.org/content/345/6199/897 


Meehl, G. et al, “Climate model simulations of the observed early-2000s hiatus 
of global warming”, Nature Climate Change, doi:10.1038/nclimate2357 (2014): 
http://www.nature.com/nclimate/journal/vaop/ncurrent/full/nclimate2357.html 


Kosaka, Y., Xie, S.P., “Recent global warming hiatus tied to equatorial Pacific 
surface cooling”, Nature 501, 403-407, (2013): 
http://www.nature.com/nature/journal/v501/n7467/full/nature12534.html 


Comments on the “hiatus”, RealClimate (2013): 
http://www.realclimate.org/index.php/archives/2013/04/the-answer-is-blowing- 
in-the-wind-the-warming-went-into-the-deep-end/ 


Trenberth, K. et al, “Earth's Energy Imbalance”, Journal of Climate, Vol 27, p. 
3129 (2014): http://journals.ametsoc.org/doi/abs/10.1175/JCLI-D-13-00294.1 


Trenberth, K. et al., “An apparent hiatus in global warming?” (PDO, 2014): 
http://onlinelibrary.wiley.com/doi/10.1002/2013EF000165/abstract 


“Study — Volcanoes contribute to recent warming ‘hiatus’”, MIT (2014): 
http://newsoffice.mit.edu/2014/study-volcanoes-contribute-to-recent-warming- 
hiatus-0223 


Huber, M. and Knutti, R., “Natural variability, radiative forcing and climate 
response in the recent hiatus reconciled” (Solar fluctuations, volcanic aerosols, 
model/data consistency), Nature Geoscience 7, 651-656 (2014): 
http://www.nature.com/ngeo/journal/v7/n9/full/ngeo2228.html 


Tollefson, J., “Climate change: The case of the missing heat”, Nature 505, 276-278 
(1/15/2014): http://www.nature.com/news/climate-change-the-case-of-the-missing-heat-1.14525 


Chaos and Weather vs. Climate: Realclimate (2005) 
http://www.realclimate.org/index.php/archives/2005/11/chaos-and-climate/ 


lxv Attribution of Recent Global Warming to Human Activity 


10 Indicators of a Human Fingerprint on Climate Change, Skeptical Science 
(2010): http://www.skepticalscience.com/10-Indicators-of-a-Human- 
Fingerprint-on-Climate-Change.html 


Attribution and Models: The Physical Science Basis (WG1), supra. 


lxvi Consensus on Mainstream Climate Science by the experts 


“The Consensus Project”, Skeptical Science: 
http://skepticalscience.com/tcp.php?t=home 


Doran, P. and Kendall Zimmerman, M., “Examining the Scientific Consensus on 
Climate Change”, EOS Vol 90 (2009): 
http://tigger.uic.edu/~pdoran/012009_Doran_final.pdf 


“Climate Graphics — 97 Hours of Consensus ", Skeptical Science: 
http://www.skepticalscience.com/graphics.php?c=9 


List of around 200 “Scientific Organizations That Hold the Position That 
Climate Change Has Been Caused by Human Action", State of California 
(2011): http://opr.ca.gov/s_listoforganizations.php 


lxvii Climate Science Computer Models - FAQ, Projections, Forcings, Scenarios 


RealClimate FAQ on Climate Models (2008, 2009): 
http://www.realclimate.org/index.php/archives/2008/11/faq-on-climate-models/ 
http://www.realclimate.org/index.php/archives/2009/01/faq-on-climate-models-part-ii/ 

Pierrehumbert, R., “Tyndall Lecture: Successful Predictions”, video of talk, 
American Geophysical Union (AGU) meeting (2012): 
https://www.youtube.com/watch?v=RICBu_P8JWI 


Rahmstorf, S. et al, “Comparing climate projections to observations up to 
2011”, Environ. Res. Lett. 7 (2012): 
http://m.iopscience.iop.org/1748-9326/7/4/044035/pdf/1748-9326_7_4_044035.pdf 


“Evaluation of Climate Models”, IPCC AR5 WG1 (2013): 
http://www.climatechange2013.org/images/report/WG1AR5_Chapter09_FINAL.pdf 

Climate Model Forcings: http://data.giss.nasa.gov/modelforce/ ; 
http://www.ipcc.ch/graphics/ar4-wg1/jpg/spm2.jpg 

Model Forecasting with Scenarios - figure, IPCC WG1 AR4 (2007): 
http://www.ipcc.ch/graphics/ar4-wg1/jpg/spm5.jpg 

Mendez, M. et al, “Climate Models: Challenges for Fortran Development 
Tools”, 2014 Second International Workshop on Software Engineering for High 
Performance Computing in Computational Science & Engineering: 
http://conferences.computer.org/sehpccse/2014/papers/7035a006.pdf 


lxviii Climate Sensitivity to CO2 increase (ECS) 


Kummer, J.R. and Dessler, A. E., “The impact of forcing efficacy on the 
equilibrium climate sensitivity", Geophysical Research Letters (5/23/2014): 
http://onlinelibrary.wiley.com/doi/10.1002/2014GL060046/abstract ; video: 
http://youtu.be/Dr2D8xTQZSM 


Lunt, D.J. et al, “Earth system sensitivity inferred from Pliocene modelling and 
data”, Nature Geosci., 3 (2010): http://pubs.giss.nasa.gov/abs/lu07000g.html 


• Knutti, R. et al, “The equilibrium sensitivity of the Earth's temperature to 
radiation changes”, Nature Geoscience 1, 735-743 (2008): 
http://www.nature.com/ngeo/journal/v1/n11/abs/ngeo337.html 


• Mann, M., “Earth Will Cross the Climate Danger Threshold by 2036; The rate 
of global temperature rise may have hit a plateau, but a climate crisis still looms 
in the near future”, Scientific American (3/18/2014): 
http://www.scientificamerican.com/article/earth-will-cross-the-climate-danger-threshold-by-2036/ 


lxix Impacts of Climate Change Locally on Cities 
• “Shifting Cities: 1,001 Blistering Future Cities”, Climate Central Report (2014): 
http://assets.climatecentral.org/pdfs/ShiftingCitiesAnalysis.pdf 


• “Climate Risk Information 2013”, NY City Panel on Climate Change: 
http://ccrun.org/NPCC-2013 


• “Climate Change Scenarios and Downscaled Climate Projections” (Boston, 
New York, Philadelphia), CCRUN (Consortium for Climate Risk in the Urban 
Northeast): http://ccrun.org/node/60 


lxx APPENDIX II (CLIMATE IMPACTS) 

• Classification of Impacts: “The Nine Planetary Boundaries”, Stockholm 
Resilience Center: 
http://www.stockholmresilience.org/21/research/research-programmes/planetary-boundaries/planetary-boundaries/about-the-research/the-nine-planetary-boundaries.html 


lxxi Impacts of Climate Change in the US 
• “Climate Change Indicators in the United States”, EPA (2012): 
http://www.epa.gov/climatechange/pdfs/climateindicators-full-2012.pdf; 
http://www.epa.gov/climatechange/science/indicators/ 


lxxii Oceans and Climate Change 
• Ocean Systems, IPCC WG2 report (2014), Ch. 6: 
http://ipcc-wg2.gov/AR5/images/uploads/WGIIAR5-Chap6_FGDall.pdf 


lxxiii Drought and Climate Change 
• Technical Summary (Impacts) WG2 AR5 (2014): 
http://ipcc-wg2.gov/AR5/images/uploads/WGIIAR5-TS_FGDall.pdf 


lxxiv Changing Precipitation and Flooding 
• Technical Summary (Impacts) WG2 AR5 (2014): 
http://ipcc-wg2.gov/AR5/images/uploads/WGIIAR5-TS_FGDall.pdf 


• Technical Summary (Science) WG1 AR5 (2013): 
http://www.climatechange2013.org/images/report/WG1AR5_TS_FINAL.pdf 


lxxv Stronger Fires and Climate Change 
• Kelly, R. et al, “Recent burning of boreal forests exceeds fire regime limits of 
the past 10,000 years” (PNAS, 2013): 
http://m.pnas.org/content/early/2013/07/19/1305069110 


• “Amid extreme drought, California sees big jump in brush fires”, L.A. Times, 
(2014): http://www.latimes.com/local/lanow/la-me-ln-amid-extreme-drought-california-see-big-jump-in-brush-fires-20140723-story.html 


• Thompson, A. and Climate Central, “Lightning May Increase with Global 
Warming”, Scientific American (11/13/2014): 
http://www.scientificamerican.com/article/lightning-may-increase-with-global-warming/ 


lxxvi Hurricane Intensity, Extreme Event Attribution, and Climate Change 


Emanuel, K., “Increasing destructiveness of tropical cyclones over the past 
30 years”, Nature 436, 686-688 (8/4/2005): 
http://www.nature.com/nature/journal/v436/n7051/abs/nature03906.html 


Herring, S. et al (editors), “Explaining Extreme Events of 2013 from a Climate 
Perspective”, Special Supplement to Bull. Amer. Meteor. Soc., 95 (9) (2014): 


• Regional Greenhouse Gas Initiative (RGGI): http://www.rggi.org/ 


lxxxiii The Clean Development Mechanism (CDM) 
• UNFCCC: http://cdm.unfccc.int/about/index.html 


lxxxiv Renewable Portfolio Standards 
• DSIRE: http://www.dsireusa.org/summarymaps/index.cfm?ee=1&RE=1 


lxxxv Utilities, Renewable Energies, Power Grids 
• The Green Power Network, U.S. Dept. of Energy (5/2012): 
http://apps3.eere.energy.gov/greenpower/markets/pricing.shtml?page=1 


• “Renewable Energy's Rise Hurting Utilities”, Governing the States and 
Localities (11/2014): http://www.governing.com/topics/transportation-infrastructure/gov-renewable-energy-rise-hurts-utilities.html 


• Feed-in tariffs: http://en.m.wikipedia.org/wiki/Feed-in_tariff 
• The Smart Power Grid: http://en.wikipedia.org/wiki/Smart_grid 


lxxxvi Chicago Climate Futures Exchange (CCFE); Climate Derivatives (RGGI) 
• The CCFE: http://www.ccfe.com/ccfeContent.jsf?id=91308 


• RGGI futures and options: 
http://www.ccfe.com/about_ccfe/products/rggi/RGGI_Futures_and_Options_Overview.pdf 


lxxxvii Beyond Coal 
• Wikipedia: http://en.wikipedia.org/wiki/Beyond_Coal 


lxxxviii Natural Gas, Fracking, and Methane Leaks 
• “Methane Leak Rate Proves Key to Climate Change Goals”, Scientific 
American (8/2014): http://www.scientificamerican.com/article/methane-leak-rate-proves-key-to-climate-change-goals/ 


• “Gasland” (HBO film, 2010 Sundance Film Festival Award-Winner): 
http://www.hbo.com/documentaries/gasland/synopsis.html#/documentaries/gasland/video/trailer.html/ 


lxxxix Fossil Fuel Subsidies 
• “U.S. Government Providing Billions in Fossil Fuel Subsidies to Unburnable 
Carbon”, Oil Change International (2014): http://priceofoil.org/2014/07/09/u-s-government-providing-billions-fossil-fuel-subsidies-unburnable-carbon/ 
• “Reforming Energy Subsidies”, IMF (3/2013): 
http://www.imf.org/external/np/fad/subsidies/index.htm 


xc APPENDIX IV: CONTRARIANS - What do they say and why are they wrong? 

• “Climate Change: Addressing the Major Skeptic Arguments”, Deutsche Bank 
Climate Change Advisors (9/2010): 
http://www.uea.ac.uk/mac/comm/media/press/CRUstatements/otherreports/Deutsche+Bank+CRU+report 


• Realclimate “Wiki”: http://realclimate.org/wiki/index.php?title=RC_Wiki 
• SkepticalScience “Misinformers”: http://skepticalscience.com/misinformers.php 


• “Global Warming Disinformation Database - An extensive database of 
individuals involved in the global warming denial industry”, deSmogBlog: 
http://www.desmogblog.com/global-warming-denier-database 


• Nordhaus, W., “Why the Global Warming Skeptics Are Wrong”, NY Rev. Books 
(3/22/2012): http://www.nybooks.com/articles/archives/2012/mar/22/why-global-warming-skeptics-are-wrong/ 

• Nuccitelli, D., “Global warming denial rears its ugly head around the world, in 
English”, The Guardian (8/18/2014): 
http://www.theguardian.com/environment/climate-consensus-97-per-cent/2014/aug/18/global-warming-denial-rears-head-in-english 


• “Why is it called Denial?” - National Center for Science Education (1/5/2012): 
http://ncse.com/climate/denial/why-is-it-called-denial 


• “Myths vs. Facts: Global Warming”, OSS: 
http://ossfoundation.us/projects/environment/global-warming/myths/revelle-gore-singer-lindzen/cosmos-myth/projects/environment/global-warming/myths 


• Souweine, D., “Google Drops ALEC Because: ‘They're Just Literally Lying’ 
About Climate Change”, Huffington Post (9/23/2014): 
http://www.huffingtonpost.com/daniel-souweine/google-drops-alec-because_b_5865046.html 


Rive, N. et al, “Complaint to Ofcom Regarding ‘The Great Global Warming 
Swindle’” (2007): http://www.ofcomswindlecomplaint.net/FullComplaint.pdf 


Suzuki, D., EcoWatch (2014): http://ecowatch.com/2014/08/05/global-warming- 
deniers-desperate/ 


“Leaked: Conservative Group Plans Anti-Climate Education Program", 
LiveScience (2/1/2012): http://www.livescience.com/18496-leaked-heartland- 
documents-climate-change.html 


“Faith groups divided over God's role in climate change, natural disasters", 
Washington Post (11/21/2014): 
http://www.washingtonpost.com/news/local/wp/2014/11/21/faith-groups-divided-over-gods-role-in-climate-change-natural-disasters/ 


Tashman, B., “James Inhofe Says the Bible Refutes Climate Change" - contains 
audio of Inhofe (3/8/2012): http://www.rightwingwatch.org/content/James- 
inhofe-says-bible-refutes-climate-change 


xci Science vs. the Contrarians 


Responses on Realclimate.org (2004): 
http://www.realclimate.org/index.php/archives/2004/12/index/#Responses 


Information and responses to contrarians on SkepticalScience.com: 
http://www.skepticalscience.com 


Pierrehumbert, R., “Climate Science is Settled Enough — the Wall Street 
Journal’s fresh face of climate inaction”, Slate (10/1/2014): 
http://www.slate.com/articles/health_and_science/science/2014/10/the_wall_street_journal_and_steve_koonin_the_new_face_of_climate_change.html 


Abraham, J., Cook, J., Fasullo, J., Jacobs, P., Mandia, S. and Nuccitelli, D., 
“Review of the consensus and asymmetric quality of research on human-induced 
climate change”: 
http://mahb.stanford.edu/wp-content/uploads/2014/05/2014_Abraham-et-al.-Climate-consensus.pdf 


“Science Scorned”, Nature, 467, 133 (9/9/2010): 
http://www.nature.com/nature/journal/v467/n7312/full/467133a.html 


Oreskes, N. and Conway, E., “Merchants of Doubt; How a Handful of Scientists 
Obscured the Truth on Issues from Tobacco Smoke to Global Warming”, 
Bloomsbury Press (2010); Naomi Oreskes responding to questions, LASCO-ELI 
Conference (part 2): https://www.youtube.com/watch?v-pV JOOOsggZ0 


Rahmstorf, S., "Climate sceptics confuse the public by focusing on short-term 
fluctuations”, Guardian (3/9/09): http://www.theguardian.com/environment/cif- 
green/2009/mar/09/climate-change-copenhagen 


Abraham, J., “Heartland Institute wastes real scientists' time - again”, The 
Guardian (2013), http://www.theguardian.com/environment/climate-consensus- 
97-per-cent/2013/may/20/heartland-institute-scientists?view=mobile 


“Climate of Doubt”, Frontline (PBS documentary, 2012): 
http://www.pbs.org/wgbh/pages/frontline/climate-of-doubt/ 


“Dealing in Doubt: The Climate Denial Machine Vs. Climate Science — A brief 
history of attacks on climate science, climate scientists and the IPCC”, 
Greenpeace (2013): 
http://www.greenpeace.org/usa/Global/usa/report/Dealing%20in%20Doubt%202013%20-%20Greenpeace%20report%20on%20Climate%20Change%20Denial%20Machine.pdf 


Cook, J., “The Scientific Guide to Global Warming Skepticism” (2010); 
http://www.skepticalscience.com/docs/Guide_to_Skepticism.pdf 


Powell, J., “The Inquisition of Climate Science”, Columbia U. Press (2011): 
http://scienceprogress.org/2011/01/the-inquisition-of-climate-science/ 


Hoggan, J., Littlemore, R., “Climate Cover-up, the Crusade to deny Global 
Warming”, Greystone Books (2009): http://www.desmogblog.com/climate- 
cover-up 


Michaels, D., “Doubt is their Product, How Industry’s Assault on Science 
Threatens your Health", Oxford University Press (2008): 
http://en.wikipedia.org/wiki/Doubt_Is_Their_Product 


“What About the Contrarians? A Guide", UU-UNO Climate Portal: 
http://climate.uu-uno.org/topics/view/51cbfc74f702fc2ba81292b2/ 


xcii The Contrarian Media 


Feldman, L., Leiserowitz, A. et al, “The mutual reinforcement of media 
selectivity ... global warming.” J. Comm. (2014): 
http://environment.yale.edu/climate-communication/article/media-echo-chambers-and-climate-change/#sthash.dbHiHueG.dpuf 


Tyson, N. video (2014): http://mediamatters.org/mobile/video/2014/03/09/neil-degrasse-tyson-media-has-to-stop-giving-cl/198418 


Mandia, S., WSJ, (2011), http://profmandia.wordpress.com/2011/01/31/wall-street-journal-selectively-pro-science/ 


Huertas, A. et al, “Science or Spin? Assessing the Accuracy of Cable News 
Coverage of Climate Science”, UCS (2014): 
http://www.ucsusa.org/sites/default/files/legacy/assets/documents/global_warming/Science-or-Spin-report.pdf 


Roston, E., “Scientists Take Issue With Rupert Murdoch’s Remarks on Climate 
Change”, Bloomberg News (7/16/2014): 
http://mobile.bloomberg.com/news/2014-07-16/scientists-take-issue-with-rupert- 
murdoch-s-remarks-on-climate-change.html 


vanden Heuvel, K., “The distorting reality of ‘false balance’ in the media”, Washington 
Post (7/2014): http://www.washingtonpost.com/opinions/katrina-vanden-heuvel-the-distorting-reality-of-false-balance-in-the-media/2014/07/14/6def5706-0b81-11e4-b8e5-d0de80767fc2_story.html 


Sinclair, P., “Climate-Denying Trolls Trained to Disrupt Internet”, Climate 
Denial Crock of the Week (1/28/2011): 
http://climatecrocks.com/2011/01/28/climate-denying-trolls-trained-to-disrupt-internet/ 


“Climate sceptics 'capture' the Bloggies' science category”, Guardian (3/2013): 
http://www.theguardian.com/environment/blog/2013/mar/01/climate-sceptics-capture-bloggies-science 

“Revealed: Keystone company's PR blitz to safeguard its backup plan”, 
Guardian (11/18/2014): 
http://www.theguardian.com/environment/2014/nov/18/revealed-keystone-companys-pr-blitz-to-safeguard-its-backup-plan 


xciii Contrarian Psychological Influence 


Campbell, T. and Kay, A., “Solution aversion: On the relation between ideology 
and motivated disbelief”, Journal of Personality and Social Psychology, Vol 
107(5), pp. 809-824 (Nov. 2014): http://psycnet.apa.org/journals/psp/107/5/809/ 


Cook, J. and Lewandowsky, S., “The Debunking Handbook” (11/27/2011): 
http://www.skepticalscience.com/Debunking-Handbook-now-freely-available- 
download.html 


xciv Connection of Climate Denial with Tobacco Risk Denial 


Oreskes, N. and Conway, E., “Merchants of Doubt” (supra) 


Bates, C. et al, “Tobacco Explained”, U.C. Report (1999), page 9: 
http://www.escholarship.org/uc/item/9fp6566b 


Sinclair, P. “Climate Denial Crock of the Week — Rhymes with Smokey Joe" 
video (11/25/2013): https://www.youtube.com/watch?v-DjOPY dl99tI 


xcv Contrarian political obstruction to Climate Action (U.S.) 


“Waxman-Markey” or “American Clean Energy and Security Act of 2009” (HR 
2454, 111th Congress) vote tally: https://www.govtrack.us/congress/votes/111-2009/h477 


Mooney, C., "Study: Rich Republicans Are the Worst Climate Deniers”, Mother 
Jones (7/10/2014): http://m.motherjones.com/environment/2014/07/climate- 
denial-wealth-rich-republicans 


Krugman, P., “Interests, Ideology, and Climate", NY Times (6/8/2014): 
http://mobile.nytimes.com/2014/06/09/opinion/krugman-interests-ideology-and- 
climate.html 


Gore, A., “The Price of Climate Change Denial", Citilab interview (Fora TV, 
2014): http://www.dailymotion.com/video/x2dqqf4_al-gore-the-price-of- 
climate-change-denial_news 


“Call Out the Climate Change Deniers” (US Congress, with their own 
statements), Organizing for Action (2014): http://ofa.barackobama.com/climate- 
deniers/?source=socnet_blog CC 20140425 bo climate-deniers£/ 


Dickenson, T., “Inside the Koch Brothers’ Toxic Empire", Rolling Stone 
(9/24/2014): http://www.rollingstone.com/politics/news/inside-the-koch- 
brothers-toxic-empire-20140924 


“Anti-Evolution and Anti-Climate Science Legislation Scorecard: 2014”, 
National Center for Science Education (6/3/2014): http://ncse.com/climate- 
evolution/anti-evolution-anti-climate-science-legislation-scorecard 


xcvi Conservatives supporting mainstream Climate Science / Action (Examples) 


Prof. Katharine Hayhoe: http://katharinehayhoe.com/ 
Prof. Barry Bickmore: https://bbickmore.wordpress.com/about-this-blog/ 


Rosner, H., “The Conservative Case for a Carbon Tax (Rep. Bob Inglis)”, 
National Geographic (9/2014): 
http://news.nationalgeographic.com/news/2014/09/140922-carbon-tax-climate- 
change-conservatives-environment-science/ 


Pope Francis, “Pope's Message to UN Convention on Climate Change”, 
(12/11/2014): https://www.salpointe.org/document.doc?id=2967 


Ruckelshaus, W., Thomas, L., Reilly, W., Whitman C. T., “A Republican Case 
for Climate Action”, NY Times (8/1/2013): 
http://www.nytimes.com/2013/08/02/opinion/a-republican-case-for-climate-action.html?_r=0 


Carbon Tax Center — compilation of Conservatives’ statements on carbon tax: 
http://www.carbontax.org/services/supporters/conservatives/ 


George Shultz, “Climate is Changing, and we need more action”, MIT News 
Office (10/1/2014): http://newsoffice.mit.edu/2014/george-shultz-climate- 
change-mit-talk-1001 


“Not All Republicans Think Alike About Global Warming”, Yale Project on 
Climate Change Communication (2015): http://environment.yale.edu/climate- 
communication/article/not-all-republicans-think-alike-about-global-warming/ 


“R Street responds to Intergovernmental Panel on Climate Change report", R 
Street (11/2014): http://www.rstreet.org/news-release/r-street-responds-to- 
intergovernmental-panel-on-climate-change-report/ 


Senator John McCain (previous record on climate action), Wikipedia: 
http://en.wikipedia.org/wiki/John_McCain 


xcvii Prisoner’s Dilemma 


Wikipedia: http://en.wikipedia.org/wiki/Prisoner%27s_dilemma 


xcviii Funding for Climate Contrarian Organizations - From Where? 


Brulle, R., “Institutionalizing delay: foundation funding and the creation of U.S. 
climate change counter-movement organizations”, Climatic Change Vol 122, p. 
681 (2014): http://link.springer.com/article/10.1007%2Fs10584-013-1018-7 


Mashey, J., “Study Details Dark Money Flowing to Climate Science Denial" 
(2013); “Fakery 2: More Funny Finances, Free Of Tax” (2012) - DeSmogBlog: 
http://www.desmogblog.com/2013/12/23/detailed-study-exposes-dark-money- 
flows-climate-science-denial; http://www.desmogblog.com/2012/10/23/fakery- 
2-more-funny-finances-free-tax 


Goldenberg, S, “Conservative groups spend up to $1bn a year to fight action on 
climate change", The Guardian (12/20/2013): 
http://www.theguardian.com/environment/2013/dec/20/conservative-groups-1bn-against-climate-change 


“Donors Trust: The shadow operation that has laundered $146 million in 
climate denial funding”, Greenpeace (2013): 
http://www.greenpeace.org/usa/Global/usa/planet3/PDFs/DonorsTrust.pdf 


“Who is Donors Trust?", Desmogblog: http://www.desmogblog.com/who- 
donors-trust 


xcix Luntz Memo, “Communications Action Plan”, Schatz Amendment 


Luntz Memo: http://en.wikipedia.org/wiki/Frank_Luntz#Global_warming 


“Global Climate Science Communications Action Plan": 
http://www.curonet.nl/users/e_wesker/ew@shell/API-prop.html 


Schatz Amendment: Congressional Record - Senate S267 (1/20/15): 
http://www.gpo.gov/fdsys/pkg/CREC-2015-01-20/pdf/CREC-2015-01-20-pt1-PgS260.pdf#page=9 ; 
http://www.senate.gov/legislative/LIS/roll_call_lists/roll_call_vote_cfm.cfm?congress=114&session=1&vote=00012 


c Non-Expertise in Climate Science 
• Anderegg, W. and Schneider, S., “Scientific expertise lacking among 'doubters' 
of climate change”, Stanford U. Press Release (2010): 
http://news.stanford.edu/pr/2010/pr-climate-change-doubters-062510.html 


• Emanuel, K., “Climategate - A Different Perspective”, NAS (2010): 
http://www.nas.org/articles/Climategate_A_Different_Perspective 


ci Misuse of the word “Proof” regarding climate change (typical example) 
• “Scott Brown says climate change is not proven”, The Hill (8/25/2014): 
http://thehill.com/policy/energy-environment/215882-scott-brown-says-climate-change-is-not-proven 


cii Contrarian Attacks on Scientists and Science; FOI Requests 
• Mann, M. E., The Hockey Stick and the Climate Wars: Dispatches from the 
Front Lines, Columbia U. Press (2012); video: 
https://www.youtube.com/watch?feature=player_detailpage&v=4jeFncatIko 


• “Freedom of information laws are used to harass scientists, says Nobel 
laureate”, The Guardian, (5/25/2011): 
http://www.theguardian.com/politics/2011/may/25/freedom-information-laws-harass-scientists 


ciii Scientists respond to contrarian attacks 
• Members of the U.S. National Academy of Sciences: Gleick, P. et al, “Climate 
Change and the Integrity of Science”, Science Vol. 328 no. 5979 (5/7/2010): 
http://www.sciencemag.org/cgi/content/full/328/5979/689 


• Krugman, P., “The Empiricist Strikes Back”, NY Times (8/10/2014): 
http://mobile.nytimes.com/blogs/krugman/2014/08/10/the-empiricist-strikes-back/ 


• Revkin, A., “A Legal Defense Fund for Climate Scientists”, Dot Earth, NY 
Times (1/25/2012): http://dotearth.blogs.nytimes.com/2012/01/25/a-legal-defense-fund-for-climate-scientists/?_r=0 


• Climate Science Legal Defense Fund: http://climatesciencedefensefund.org/ 


• “Climate Scientist Sues National Post for Libel”, McConchie Law Corporation, 
Market Wired (4/21/2010): http://www.marketwired.com/press-release/climate-scientist-sues-national-post-for-libel-1151667.htm 


civ Fringe Climate Science (definition, example - cosmic rays and climate change) 
• Fringe Science definition: http://en.wikipedia.org/wiki/Fringe_science 


• “A review of cosmic rays and climate: a cluttered story of little success”, 
Realclimate (2012): http://www.realclimate.org/index.php/archives/2012/12/a-review-of-cosmic-rays-and-climate-a-cluttered-story-of-little-success/ 


cv Feynman's “Cargo-Cult Science” talk at Cal Tech 
• Feynman, R. P., Engineering and Science (1974): 
http://calteches.library.caltech.edu/51/2/CargoCult.pdf 


cvi Errors in the UAH Analysis of MSU Satellite Data and Related Topics 
• Realclimate (2005): http://www.realclimate.org/index.php/archives/2005/08/et-tu-lt/ ; 
http://www.realclimate.org/index.php/archives/2005/11/more-satellite-stuff/ ; 
http://www.realclimate.org/index.php?p=179 


• Pierrehumbert, R. (AGU talk, supra), start at 34:40 


• Fu, Q. and Johanson, C., “Satellite-derived vertical dependence of tropical 
tropospheric temperature trends”, Geophysical Research Letters, Vol. 32, 
(2005): http://www.atmos.washington.edu/~qfu/Publications/grl.fu.2005.pdf 


• “Upper Air Temperature”, Remote Sensing Systems (RSS): 
http://www.remss.com/measurements/upper-air-temperature 


cvii The BEST (Berkeley Earth Surface Temperature) Study 
• Wiki: http://en.wikipedia.org/wiki/Berkeley_Earth#July_2012_announcement 


• Muller, R., “The Conversion of a Climate-Change Skeptic”, NY Times (2012): 
http://www.nytimes.com/2012/07/30/opinion/the-conversion-of-a-climate-change-skeptic.html? 


cviii The One-liner Responses to over 100 Contrarian Scientific Fallacies 
• “Global Warming and Climate Change Myths”, SkepticalScience: 
http://www.skepticalscience.com/argument.php 


• Cook, J., “Rebutting skeptic arguments in a single line” (7/20/2010): 
http://skepticalscience.com/Rebutting-skeptic-arguments-in-a-single-line.html 


cix “Conversations” including a few Climate Science/Risk Contrarians/Deniers 
• Wilmott Off-topic Forum (search Author/JWD for the “Off Topic” Forum): 
http://www.wilmott.com/search.cfm 


cx Skeptical Science Logic/Science Fallacy List 
• “Climate Myth Fallacies” (MOOC): http://skepticalscience.com/fallacies.shtml 


• Example: “Snowmageddon”, CBS News (2010): 
http://www.cbsnews.com/news/conservatives-use-snowmageddon-to-mock-global-warming/ 


cxi Contrarian Conspiracy Fallacies 
• “Global Warming Conspiracy Theory”: 
http://en.m.wikipedia.org/wiki/Global_warming_conspiracy_theory 


• “Antigovernment Conspiracy Theorists Rail Against UN's Agenda 21 
Program”, Southern Poverty Law Center (2012): http://www.splcenter.org/get-informed/intelligence-report/browse-all-issues/2012/spring/behind-the-green-mask 


cxii Contrarian attacks on the “Hockey Stick”; Medieval Warm Period fallacy 
• “Myths vs. facts regarding the hockey stick”, RealClimate (2004): 
http://www.realclimate.org/index.php/archives/2004/12/myths-vs-fact-regarding-the-hockey-stick/ 


• Medieval Warm Period Fallacy (RealClimate, 2004): 
http://www.realclimate.org/index.php/archives/2004/12/werent-temperatures-warmer-during-the-medieval-warm-period-than-they-are-today/ 


cxiii The False Dilemma of Poverty Amelioration vs. Climate Mitigation 
• Fog, K. vs. Lomborg, B., “The use of discount rates in ‘Copenhagen Consensus 
2008’; Debate in ‘Frederiksborg Amts Avis’” (2008): http://www.lomborg-errors.dk/Faadebate2008.htm 


• Nuccitelli, D., “Climate change could impact the poor much more than 
previously thought”, Guardian (1/26/2015): 
http://www.theguardian.com/environment/climate-consensus-97-per-cent/2015/jan/26/climate-change-could-impact-poor-much-more-than-previously-thought 


• “The 67 People As Wealthy As The World's Poorest 3.5 Billion”, Forbes 
(3/25/2014): http://www.forbes.com/sites/forbesinsights/2014/03/25/the-67-people-as-wealthy-as-the-worlds-poorest-3-5-billion/ 


cxiv “Climategate” and Subsequent Investigations Clearing Scientists 
• Emanuel, K., “Climategate - A Different Perspective”, NAS (2010): 
http://www.nas.org/articles/Climategate_A_Different_Perspective 


• Climategate Investigations - Conclusions from Union of Concerned Scientists: 
http://www.ucsusa.org/global_warming/solutions/fight-misinformation/debunking-misinformation-stolen-emails-climategate.html#.VCUzJOfKzze 


• Wiki: http://en.wikipedia.org/wiki/Climatic_Research_Unit_email_controversy 


• Example of using the word “trick” in science - the Replica Trick: 
http://en.wikipedia.org/wiki/Replica_trick 


cxv Contrarian Attacks on the IPCC 
• “IPCC errors: facts and spin”, RealClimate (2010): 
http://www.realclimate.org/index.php/archives/2010/02/ipcc-errors-facts-and-spin/ 
• Bickmore, B., “Climate Asylum” (blog, 2011): 
http://bbickmore.wordpress.com/2011/01/19/marc moranos ipcc dissenters/ 


cxvi Contrarians Underestimate Risk, Overestimate Mitigation Costs (example) 
• Sourcewatch: http://www.sourcewatch.org/index.php?title=Bjorn_Lomborg 


cxvii The Precautionary Principle and the Rio Declaration, ignored by Contrarians 
• Wikipedia: http://en.wikipedia.org/wiki/Precautionary_principle#Formulations 


• Rio Declaration (UNEP): http://www.unep.org/Documents.multilingual/Default.asp?DocumentID=78&ArticleID=1163 


cxviii Neanderthals - An Adaptation Failure 
• Wikipedia: http://en.wikipedia.org/wiki/Neanderthal 


cxix Contrarian Opposition to Renewable Energy 
• “ALEC calls for penalties on 'freerider' homeowners in assault on clean 
energy”, Guardian (2013): http://www.theguardian.com/world/2013/dec/04/alec-freerider-homeowners-assault-clean-energy 


• Foster, J., “Koch-Funded Group Won't Back Kansas Republicans Who 
Supported Clean Energy”, Think Progress (6/17/2014): 
http://thinkprogress.org/climate/2014/06/17/3449826/rps-gop-support-kansas-koch/? 